Replica amplification of nucleic acid arrays

ABSTRACT

Disclosed are improved methods of making and using immobilized arrays of nucleic acids, particularly methods for producing replicas of such arrays. Included are methods for producing high density arrays of nucleic acids and replicas of such arrays, as well as methods for preserving the resolution of arrays through rounds of replication. Also included are methods which take advantage of the availability of replicas of arrays for increased sensitivity in detection of sequences on arrays. Improved methods of sequencing nucleic acids immobilized on arrays utilizing single copies of arrays and methods taking further advantage of the availability of replicas of arrays are disclosed. The improvements lead to higher fidelity and longer read lengths of sequences immobilized on arrays. Methods are also disclosed which improve the efficiency of multiplex PCR using arrays of immobilized nucleic acids.

[0001] This application was funded by DOE Grant No. DEFG02-87ER-60565 and is a continuation-in-part of U.S. patent application Ser. No. 09/267,496, filed Mar. 12, 1999, which in turn is a continuation-in-part of U.S. patent application Ser. No. 09/143,014, filed Aug. 28, 1998. The application claims the benefit of U.S. Provisional Application No. 60/076,570, filed Mar. 2, 1998, and U.S. Provisional Application No. 60/061,511, filed Oct. 10, 1997.

FIELD OF THE INVENTION

[0002] The invention relates in general to the reproducible mass-production of nucleic acid arrays. The invention also relates to methods of sequencing nucleic acids on arrays.

BACKGROUND OF THE INVENTION

[0003] Arrays of nucleic acid molecules are of enormous utility in facilitating methods aimed at genomic characterization (such as polymorphism analysis and high-throughput sequencing techniques), screening of clinical patients or entire pedigrees for the risk of genetic disease, elucidation of protein/DNA or protein/protein interactions, or the assay of candidate pharmaceutical compounds for efficacy; however, such arrays are both labor-intensive and costly to produce by conventional methods. Highly ordered arrays of nucleic acid fragments are known in the art (Fodor et al., U.S. Pat. No. 5,510,270; Lockhart et al., U.S. Pat. No. 5,556,752). Chetverin and Kramer (WO 93/17126) are said to disclose a highly ordered array which may be amplified.

[0004] U.S. Pat. No. 5,616,478 of Chetverin and Chetverina reportedly claims methods of nucleic acid amplification, in which pools of nucleic acid molecules are positioned on a support matrix to which they are not covalently linked. Utermohlen (U.S. Pat. No. 5,437,976) is said to disclose nucleic acid molecules randomly immobilized on a reusable matrix.

[0005] There is a need in the art for improved methods of nucleic acid array design and production. There is also a need in the art for methods with improved resolution and/or sensitivity for detection of sequences on nucleic acid arrays. There is also a need in the art for improved methods of sequencing the molecules on nucleic acid arrays.

SUMMARY OF THE INVENTION

[0006] The invention provides a method of producing a high density array of immobilized nucleic acid molecules, such method comprising the steps of: 1) creating an array of spots of a nucleic acid capture activity such that the spots of said capture activity are separated by a distance greater than the diameter of the spots, and the size of the spots is less than the diameter of the excluded volume of the nucleic acid molecule to be captured; and 2) contacting the array of spots of nucleic acid capture activity with an excess of nucleic acid molecules with an excluded volume diameter greater than the diameter of the spots of nucleic acid capture activity, resulting in an immobilized array of nucleic acid molecules in which each spot of nucleic acid capture activity can bind only one nucleic acid molecule with an excluded volume diameter greater than the size of said spots of nucleic acid capture activity.
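The two geometric conditions of step 1 can be checked with a minimal sketch such as the following. The class name, field names, and the choice of nanometers are illustrative assumptions and do not appear in the disclosure.

```python
from dataclasses import dataclass


@dataclass
class CaptureArrayDesign:
    # All names and units here are hypothetical, chosen for illustration only.
    spot_diameter_nm: float      # diameter of each capture-activity spot
    spot_spacing_nm: float       # distance separating neighboring spots
    excluded_diameter_nm: float  # excluded-volume diameter of the nucleic acid


def single_occupancy_violations(design):
    """Return a list of the conditions of step 1 that the design fails."""
    problems = []
    # Spots must be separated by more than their own diameter.
    if not design.spot_spacing_nm > design.spot_diameter_nm:
        problems.append("spacing must exceed spot diameter")
    # A spot smaller than the molecule's excluded volume can hold only one molecule.
    if not design.spot_diameter_nm < design.excluded_diameter_nm:
        problems.append("spot must be smaller than the excluded-volume diameter")
    return problems
```

An empty list indicates that, under this simplified model, each spot could bind at most one molecule.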

[0007] In a preferred embodiment of the invention, the nucleic acid capture activity may be a hydrophobic compound, an oligonucleotide, an antibody or fragment of an antibody, a protein, a peptide, an intercalator, biotin, avidin, or streptavidin.

[0008] In another embodiment of the invention, the immobilized array of spots of a nucleic acid capture activity is arranged in a predetermined geometry.

[0009] In another embodiment, the immobilized spots of a nucleic acid capture activity are aligned with other microfabricated features.

[0010] The invention also encompasses a method of making a plurality of a high-density nucleic acid array made using spots of nucleic acid capture activity as described above.

[0011] The invention provides a method for the detection of a nucleic acid on an array of nucleic acid molecules, such method comprising the steps of generating a plurality of a nucleic acid molecule array wherein the nucleic acid molecules of each member of said plurality occupy positions which correspond to those positions occupied by the nucleic acid molecules of each other member of said plurality of a nucleic acid array, and subjecting one or more members of said plurality, but at least one less than the total number of said plurality, to a method of signal detection comprising a signal amplification method which renders said member of said plurality of a nucleic acid array non-reusable.

[0012] It is preferred that the signal amplification method comprises fluorescence measurement.

[0013] In a preferred embodiment, the method of detection of a nucleic acid on an array of nucleic acid molecules detects the amount of an RNA expressed in a first RNA-containing nucleic acid population relative to that expressed in a second RNA-containing nucleic acid population. The method further comprises the steps of preparing a first population of fluorescently labeled cDNA using said first population of RNA-containing nucleic acid as a template, preparing a second fluorescently labeled cDNA population using said second population of RNA-containing nucleic acid as a template, said second fluorescently labeled cDNA population being labeled with a fluorescent label distinguishable from that used to label said first population, contacting a mixture of said first fluorescently labeled cDNA population and said second fluorescently labeled cDNA population with a member of said plurality of nucleic acid arrays under conditions which permit hybridization of said fluorescently labeled cDNA populations with nucleic acids immobilized on said members of said plurality of nucleic acid arrays, and detecting the fluorescence of said first fluorescently labeled population of cDNA and the fluorescence of said second fluorescently labeled population of cDNA hybridized to said member of said plurality of nucleic acid arrays, wherein the relative amount of said first fluorescent label and said second fluorescent label detected on a given nucleic acid feature of said array indicates the relative level of expression of RNA derived from the nucleic acid of that feature in the mRNA-containing cDNA populations tested.
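As a hedged illustration of how the relative amounts of the two fluorescent labels on one feature might be expressed numerically, the sketch below computes a log2 ratio. The function name, the optional background subtraction, and the use of a logarithmic scale are assumptions made for illustration; the disclosure does not prescribe a particular calculation.

```python
import math


def log2_expression_ratio(signal_first, signal_second,
                          background_first=0.0, background_second=0.0):
    """Log2 ratio of first-label to second-label fluorescence on one feature.

    Background subtraction is an illustrative assumption. Returns None when
    either corrected signal is not positive, since no ratio can be formed.
    """
    first = signal_first - background_first
    second = signal_second - background_second
    if first <= 0 or second <= 0:
        return None
    return math.log2(first / second)
```

A positive value would suggest higher expression in the first population for that feature, and a negative value higher expression in the second.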

[0014] In another embodiment, the method of detection of a nucleic acid on an array of nucleic acid molecules detects the amount of an RNA expressed in a first RNA-containing nucleic acid population relative to that expressed in a second RNA-containing nucleic acid population. The method further comprises the steps of preparing a first population of fluorescently labeled cDNA using said first population of RNA-containing nucleic acid as a template, preparing a second fluorescently labeled cDNA population using said second population of RNA-containing nucleic acid as a template, contacting said first fluorescently labeled cDNA population with one member of a plurality of immobilized nucleic acid arrays under conditions which permit hybridization of said fluorescently labeled cDNA population with nucleic acid immobilized on said member of a plurality of immobilized nucleic acid arrays, contacting said second fluorescently labeled cDNA population with another member of the same plurality of immobilized nucleic acid arrays under conditions which permit hybridization of said fluorescently labeled cDNA populations with nucleic acid immobilized on said members of a plurality of immobilized nucleic acid arrays, detecting the intensity of fluorescence on each member of said plurality contacted with a fluorescently labeled cDNA population, and comparing the intensity of fluorescence detected on each member of said plurality of immobilized nucleic acid arrays so tested, to determine the relative expression of mRNA derived from those nucleic acids on the array in the mRNA-containing cDNA populations tested.

[0015] The invention provides a method of preserving the resolution of nucleic acid features on a first immobilized array during cycles of array replication, said method comprising the steps of: a) amplifying the features of a first array to yield an array of features with a hemispheric radius, r, and a cross-sectional area, q, at the surface supporting said array, such that said features remain essentially distinct; b) contacting said array of features with a radius, r, with a support, maintained at a fixed distance from said first array, said fixed distance less than r, and such that the cross-sectional area of the hemispheric feature, measured at said fixed distance from the surface supporting said first array, is less than q, and such that at least a subset of nucleic acid molecules produced by said amplifying are transferred to said support; and c) covalently affixing said nucleic acid molecules to said support to form a replica of said first immobilized array, wherein the positions of said nucleic acid molecules on said replica correspond to the positions of said nucleic acid molecules of said first array from which they were amplified, and wherein the areas occupied on the surface of said support by the individual features of said replica are less than the areas occupied on the surface supporting said first immobilized array.
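The geometric relationship in step (b) can be illustrated with a short sketch that assumes an ideal hemisphere, which real amplified features only approximate. For a hemisphere of radius r, a slice parallel to the base at height d has area π(r² − d²), which is smaller than the base area q = πr² whenever 0 < d < r. The function name is hypothetical.

```python
import math


def cross_section_area(r, d):
    """Area of the circular slice through an ideal hemisphere of radius r,
    taken parallel to the base at height d above it (0 <= d < r)."""
    if not 0 <= d < r:
        raise ValueError("d must satisfy 0 <= d < r")
    # Slice radius follows from Pythagoras: sqrt(r^2 - d^2).
    return math.pi * (r * r - d * d)
```

Under this idealization, a support held at a distance of 0.6 r from the first array would contact a footprint of 0.64 q, so the replica features occupy less area than the originals.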

[0016] It is preferred that said amplifying be performed by PCR.

[0017] In another embodiment of the method of preserving the resolution of nucleic acid features on a first immobilized array during cycles of array replication, the method is repeated to yield further replicas with preserved resolution.

[0018] The invention provides a method for determining the nucleotide sequence of the features of an immobilized nucleic acid array, such method comprising the steps of: a) ligating a first double-stranded nucleic acid probe to one end of a nucleic acid of a feature of said array, said first double-stranded nucleic acid probe having a restriction endonuclease recognition site for a restriction endonuclease whose cleavage site is separate from its recognition site and which generates a protruding strand upon cleavage; b) identifying one or more nucleotides at the end of said polynucleotide by the identity of the first double-stranded nucleic acid probe ligated thereto or by extending a strand of the polynucleotide or probe; c) amplifying the features of said array using a primer complementary to said first double-stranded nucleic acid probe, such that only molecules which have been successfully ligated with said first double-stranded nucleic acid probe are amplified to yield an amplified array; d) contacting said amplified array with a support such that at least a subset of nucleic acid molecules produced by said amplifying are transferred to said support; e) covalently attaching said subset of nucleic acid molecules to said support to form a replica of said amplified array; f) cleaving the nucleic acid features of the array with a nuclease recognizing said nuclease recognition site of said probe such that the nucleic acid of the features is shortened by one or more nucleotides; and g) repeating steps (a)-(f) until the nucleotide sequences of the features of said array are determined.

[0019] It is preferred that the nucleic acid probe comprises four components, each component being capable of indicating the presence of a different nucleotide in the protruding strand upon ligation. It is further preferred that each of the components of the probe is labeled with a different fluorescent dye and that the different fluorescent dyes are spectrally resolvable.

[0020] In another embodiment of the invention, the features of the array are amplified after step (e) and before step (f).

[0021] It is preferred that the amplifying be accomplished by PCR.

[0022] In another embodiment, the method of determining the sequence of the features of an immobilized nucleic acid array is modified such that: i) after one or more cycles using said first double-stranded nucleic acid probe in step (a), a distinct nucleic acid probe is used in place of said first double-stranded nucleic acid probe, said distinct nucleic acid probe comprising a restriction endonuclease recognition site for a restriction endonuclease whose cleavage site is separated from its recognition site, said distinct nucleic acid probe also comprising sequences such that a primer complementary to said distinct nucleic acid probe will not hybridize with said first double-stranded nucleic acid probe; and ii) a primer complementary to said distinct nucleic acid probe is used in place of said primer complementary to said first double-stranded nucleic acid probe in step (c), so that selective amplification of those features which successfully completed the previous cycle of restriction and ligation occurs.

[0023] In another embodiment of this modified method of determining the nucleotide sequence of the features of an immobilized nucleic acid array, a new distinct nucleic acid probe is used after each cycle of restriction and ligation, said new distinct nucleic acid probe comprising a sequence such that a primer complementary to that sequence will not hybridize to any probe used in previous cycles.

[0024] The invention provides a method of determining the nucleotide sequence of the features of an array of immobilized nucleic acids comprising the steps of: a) adding a mixture comprising an oligonucleotide primer and a template-dependent polymerase to an array of immobilized nucleic acid features under conditions permitting hybridization of the primer to the immobilized nucleic acids; b) adding a single, fluorescently labeled deoxynucleoside triphosphate to the mixture under conditions which permit incorporation of the labeled deoxynucleotide onto the 3′ end of the primer if it is complementary to the next adjacent base in the sequence to be determined; c) detecting incorporated label by monitoring fluorescence; d) repeating steps (b)-(c) with each of the remaining three labeled deoxynucleoside triphosphates in turn; and e) repeating steps (b)-(d) until the nucleotide sequence is determined.
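The cycle of steps (b)-(e) can be sketched as an idealized simulation for a single feature. The sketch assumes complete, error-free extension and assumes that a run of identical template bases incorporates several labeled nucleotides in one addition with a proportionally stronger signal; the function name, the order of nucleotide addition, and the cycle limit are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sequence_by_stepwise_addition(template, dntp_order="ACGT", max_cycles=100):
    """Simulate steps (b)-(e) for one feature.

    `template` lists the bases downstream of the primer in the order the
    polymerase meets them. The returned string is the newly synthesized
    strand, i.e. the complement read one addition at a time.
    """
    position = 0
    read = []
    for _ in range(max_cycles):                # step (e): repeat the cycle
        if position >= len(template):
            break
        for dntp in dntp_order:                # steps (b) and (d): one dNTP at a time
            incorporated = 0
            while (position < len(template)
                   and COMPLEMENT[template[position]] == dntp):
                position += 1
                incorporated += 1
            if incorporated:                   # step (c): a signal is detected
                read.append(dntp * incorporated)
    return "".join(read)
```

In this model, a template of `"ACGT"` requires four full cycles because each added nucleotide is incorporated at only one position per pass.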

[0025] In a preferred embodiment, the primer, buffer and polymerase are cast into a polyacrylamide gel bearing the array of immobilized nucleic acids.

[0026] It is preferred that the single fluorescently labeled deoxynucleotide further comprises a mixture of the single deoxynucleoside triphosphate in labeled and unlabeled forms.

[0027] In another embodiment, the additional step of photobleaching said array is performed after step (d) and before step (e).

[0028] In another embodiment, the fluorescently labeled deoxynucleoside triphosphates are labeled with a cleavable linkage to the fluorophore, and the additional step of cleaving said linkage to the fluorophore is performed after step (d) and before step (e).

[0029] In another embodiment, the oligonucleotide primer comprises sequences permitting formation of a hairpin loop.

[0030] In another embodiment, after a predetermined number of cycles of steps (b)-(d), a defined regimen of deoxynucleotide and chain-terminating deoxynucleotide analog addition is performed, such that out-of-phase molecules are blocked from further extension cycles, said regimen followed by continued cycles of steps (b)-(d) until the nucleotide sequence of the features of the array is determined.

[0031] The invention provides a method of determining the nucleotide sequence of the features of an array of immobilized nucleic acids comprising the steps of: a) adding a mixture comprising an oligonucleotide primer and a template-dependent polymerase to an array of immobilized nucleic acid features under conditions permitting hybridization of the primer to the immobilized nucleic acids; b) adding a first mixture of three unlabeled deoxynucleoside triphosphates under conditions which permit incorporation of deoxynucleotides to the end of the primer if they are complementary to the next adjacent base in the sequence to be determined; c) adding a second mixture of three unlabeled deoxynucleoside triphosphates, along with buffer and polymerase if necessary, said second mixture comprising the deoxynucleoside triphosphate not included in the mixture of step (b), under conditions which permit incorporation of deoxynucleotides to the end of the primer if they are complementary to the next adjacent base in the sequence to be determined; d) repeating steps (b)-(c) for a predetermined number of cycles; e) adding a single, fluorescently labeled deoxynucleoside triphosphate to the mixture under conditions which permit incorporation of the labeled deoxynucleotide onto the 3′ terminus of the primer if it is complementary to the next adjacent base in the sequence to be determined; f) detecting incorporated label by monitoring fluorescence; g) repeating steps (e)-(f) with each of the remaining three labeled deoxynucleoside triphosphates in turn; and h) repeating steps (e)-(g) until the nucleotide sequence is determined.

[0032] It is preferred that for the first or second mixtures of three unlabeled deoxynucleoside triphosphates, a mixture which comprises deoxyguanosine triphosphate further comprises deoxyadenosine triphosphate.
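Steps (b)-(d) advance the primer with unlabeled nucleotides before labeled sequencing begins. The following idealized sketch shows how alternating two three-nucleotide mixtures moves the primer along a template, stalling each time the missing nucleotide is required. It assumes complete, error-free extension; the function names and the example mixtures (A, C, G followed by A, G, T, both consistent with the stated preference that a mixture containing dGTP also contains dATP) are illustrative choices, not taken from the disclosure.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def extend(template, position, available):
    """Extend the primer until the required nucleotide is absent from the mix."""
    while (position < len(template)
           and COMPLEMENT[template[position]] in available):
        position += 1
    return position


def walk_primer(template, first_mix, second_mix, cycles):
    """Apply steps (b) and (c) alternately for `cycles` rounds (step (d)).

    `template` lists the bases downstream of the primer in the order the
    polymerase meets them. Returns the number of bases traversed.
    """
    position = 0
    for _ in range(cycles):
        position = extend(template, position, first_mix)   # step (b)
        position = extend(template, position, second_mix)  # step (c)
    return position
```

For the template `"TTGCATG"`, one round with the mixtures `"ACG"` and `"AGT"` advances the primer six bases in this model, and a second round reaches the end of the template.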

[0033] In a preferred embodiment of this method, the primer and polymerase are cast into a polyacrylamide gel bearing the array of immobilized nucleic acids.

[0034] In a preferred embodiment, the single fluorescently labeled deoxynucleotide further comprises a mixture of the single deoxynucleoside triphosphate in labeled and unlabeled forms.

[0035] In another embodiment of this method of determining the nucleotide sequence of nucleic acid features on an array, the additional step of photobleaching the array is performed after step (g) and before step (h).

[0036] In another embodiment of this method of determining the nucleotide sequence of nucleic acid features on an array, the fluorescently labeled deoxynucleoside triphosphates are labeled with a cleavable linkage to the fluorophore, and after step (g) and before step (h) the additional step of cleaving the linkage to the fluorophore is performed.

[0037] In another embodiment of this method of determining the nucleotide sequence of nucleic acid features on an array, the oligonucleotide primer comprises sequences permitting formation of a hairpin loop.

[0038] In another embodiment of this method of determining the nucleotide sequence of nucleic acid features on an array, after a predetermined number of cycles of steps (e)-(g), a defined regimen of deoxynucleotide and chain-terminating deoxynucleotide analog addition is performed, such that out-of-phase molecules are blocked from further extension cycles, said regimen followed by continued cycles of steps (e)-(g) until said nucleotide sequence is determined.

[0039] The invention provides a method of determining the nucleotide sequence of the features of a micro-array of nucleic acid molecules, said method comprising the steps of: a) creating a micro-array of nucleic acid features in a linear arrangement within and along one side of a polyacrylamide gel, said gel further comprising one or more oligonucleotide primers and a template-dependent polymerizing activity; b) amplifying the micro-array; c) adding a mixture of deoxynucleoside triphosphates, said mixture comprising each of the four deoxynucleoside triphosphates dATP, dGTP, dCTP and dTTP, said mixture further comprising chain-terminating analogs of each of the deoxynucleoside triphosphates dATP, dGTP, dCTP and dTTP, and said chain-terminating analogs each distinguishably labeled with a spectrally distinguishable fluorescent moiety; d) incubating said mixture with said micro-array under conditions permitting extension of said one or more oligonucleotide primers; e) electrophoretically separating the products of said extension within said polyacrylamide gel; and f) determining the nucleotide sequence of the features of said micro-array by detecting the fluorescence of the extended, terminated and separated reaction products within the gel.

[0040] It is preferred that the amplifying be performed by PCR.

[0041] In another embodiment, the amplifying may be performed by an isothermal method.

[0042] In another embodiment, the microarray of nucleic acid features in a linear arrangement is derived as a replica of features arranged on a chromosome.

[0043] In another embodiment, the microarray of nucleic acid features in a linear arrangement is derived as a replica of one linear subset of features on a separate, non-linear micro-array of nucleic acid features.

[0044] The invention provides a method of simultaneously amplifying a plurality of nucleic acids, said method comprising the steps of: a) creating a micro-array of immobilized oligonucleotide primers; b) incubating the microarray with amplification template and a non-immobilized oligonucleotide primer under conditions allowing hybridization of said template with said oligonucleotide primers; c) incubating the hybridized primers and template with a DNA polymerase activity and deoxynucleoside triphosphates under conditions permitting extension of the primers; and d) repeating steps (b) and (c) for a defined number of cycles to yield a plurality of amplified DNA molecules.

[0045] It is preferred that the non-immobilized oligonucleotide primer comprises a pool of oligonucleotide primers comprised of 5′ and 3′ sequence elements, said 5′ sequence element identical in all members of said pool, and said 3′ sequence element containing random sequences.

[0046] It is preferred that the 5′ sequence element comprises a restriction endonuclease recognition sequence.
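The structure of such a primer pool, a constant 5′ element followed by a random 3′ element, can be illustrated in silico with the sketch below. The function name, the seeded random generator, and the example 5′ element (the EcoRI recognition sequence GAATTC) are illustrative assumptions; the disclosure does not name a particular recognition sequence or random-element length.

```python
import random


def make_primer_pool(five_prime_element, random_length, pool_size, seed=0):
    """Build primer sequences written 5' to 3': a shared 5' element followed
    by a random 3' element of `random_length` bases."""
    rng = random.Random(seed)  # seeded only so the illustration is repeatable
    pool = []
    for _ in range(pool_size):
        random_element = "".join(rng.choice("ACGT") for _ in range(random_length))
        pool.append(five_prime_element + random_element)
    return pool


# Hypothetical usage: an EcoRI site as the constant element, nine random bases.
example_pool = make_primer_pool("GAATTC", random_length=9, pool_size=5)
```

Every member of `example_pool` begins with the same six bases and differs only in its 3′ portion.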

[0047] In another embodiment, the 5′ sequence element comprises a transcriptional promoter sequence.

[0048] In another embodiment, the immobilized primers are amplified before step (b).

[0049] In another embodiment, the immobilized oligonucleotide primers are generated from genomic DNA.

[0050] In a preferred embodiment, the microarray, template, non-immobilized primer, and polymerase are cast in a polyacrylamide gel.

[0051] The invention provides a method of making an immobilized nucleic acid molecule array, the method comprising: a) providing template DNA and a pair of PCR primers, wherein at least one member of the pair is Acrydite modified; b) mixing the template DNA and PCR primers with a solution comprising acrylamide monomers; c) contacting the mixture of step (b) with a solid support and polymerizing the acrylamide monomers; and d) amplifying the template DNA by PCR to generate an immobilized nucleic acid molecule array.

[0052] In a preferred embodiment, the solid support is a glass microscope slide.

[0053] In another preferred embodiment, the solution comprising acrylamide monomers further comprises a template-dependent DNA polymerase.

[0054] In another preferred embodiment, the polymerase is Taq DNA polymerase.

[0055] In another preferred embodiment, the template DNA comprises binding sites for the pair of PCR primers, with one binding site on each side of a variable sequence.

[0056] In another preferred embodiment, the template DNA comprises a library.

[0057] The invention provides a method of making a plurality of an immobilized nucleic acid molecule array, the method comprising: a) providing template DNA and a pair of PCR primers, wherein at least one member of the pair of PCR primers is Acrydite modified; b) mixing the template DNA and pair of PCR primers with a solution comprising acrylamide monomers; c) contacting the mixture of step (b) with a solid support that binds to polyacrylamide, and polymerizing the acrylamide monomers to form a first layer; d) contacting the first layer with a mixture comprising the pair of PCR primers and acrylamide monomers, and polymerizing the acrylamide monomers to form a second layer; e) amplifying the template DNA by PCR to generate an immobilized nucleic acid molecule array; f) removing the second layer, wherein the second layer comprises a duplicate of the array; and g) repeating steps (d)-(f) one or more times to generate a plurality of an immobilized nucleic acid molecule array.

[0058] In a preferred embodiment, the solid support is a glass microscope slide.

[0059] In another preferred embodiment, the solution comprising acrylamide monomers further comprises a thermostable, template-dependent DNA polymerase.

[0060] In another preferred embodiment, the polymerase is Taq DNA polymerase.

[0061] In another preferred embodiment, the template DNA comprises binding sites for the pair of PCR primers, with one binding site on each side of a variable sequence.

[0062] In another preferred embodiment, the template DNA comprises a library.

[0063] As used herein in reference to nucleic acid arrays, the term “plurality” is defined as designating two or more such arrays, wherein a first (or “template”) array plus a second array made from it comprise a plurality. When such a plurality comprises more than two arrays, arrays beyond the second array may be produced using either the first array or any copy of it as a template.

[0064] As used herein, the terms “randomly-patterned” or “random” refer to a non-ordered, non-Cartesian distribution (in other words, not arranged at pre-determined points along the x- and y-axes of a grid or at defined ‘clock positions’, degrees or radii from the center of a radial pattern) of nucleic acid molecules over a support, that is not achieved through an intentional design (or program by which such a design may be achieved) or by placement of individual nucleic acid features. Such a “randomly-patterned” or “random” array of nucleic acids may be achieved by dropping, spraying, plating or spreading a solution, emulsion, aerosol, vapor or dry preparation comprising a pool of nucleic acid molecules onto a support and allowing the nucleic acid molecules to settle onto the support without intervention in any manner to direct them to specific sites thereon.

[0065] As used herein, the terms “immobilized” or “affixed” refer to covalent linkage between a nucleic acid molecule and a support matrix.

[0066] As used herein, the term “array” refers to a heterogeneous pool of nucleic acid molecules that is distributed over a support matrix; preferably, these molecules differing in sequence are spaced at a distance from one another sufficient to permit the identification of discrete features of the array.

[0067] As used herein, the term “heterogeneous” is defined to refer to a population or collection of nucleic acid molecules that comprises a plurality of different sequences; it is contemplated that a heterogeneous pool of nucleic acid molecules results from a preparation of RNA or DNA from a cell which may be unfractionated or partially-fractionated.

[0068] An “unfractionated” nucleic acid preparation is defined as that which has not undergone the selective removal of any sequences present in the complement of RNA or DNA, as the case may be, of the biological sample from which it was prepared. A nucleic acid preparation in which the average molecular weight has been lowered by cleaving the component nucleic acid molecules, but which still retains all sequences, is still “unfractionated” according to this definition, as it retains the diversity of sequences present in the biological sample from which it was prepared.

[0069] A “partially-fractionated” nucleic acid preparation may have undergone qualitative size-selection. In this case, uncleaved sequences, such as whole chromosomes or RNA molecules, are selectively retained or removed based upon size. In addition, a “partially-fractionated” preparation may comprise molecules that have undergone selection through hybridization to a sequence of interest; alternatively, a “partially-fractionated” preparation may have had undesirable sequences removed through hybridization. It is contemplated that a “partially-fractionated” pool of nucleic acid molecules will not comprise a single sequence that has been enriched after extraction from the biological sample to the point at which it is pure, or substantially pure.

[0070] In this context, “substantially pure” refers to a single nucleic acid sequence that is represented by a majority of nucleic acid molecules of the pool. Again, this refers to enrichment of a sequence in vitro; obviously, if a given sequence is heavily represented in the biological sample, a preparation containing it is not excluded from use according to the invention.

[0071] As used herein, the term “biological sample” refers to a whole organism or a subset of its tissues, cells or component parts (e.g., fluids). “Biological sample” further refers to a homogenate, lysate or extract prepared from a whole organism or a subset of its tissues, cells or component parts, or a fraction or portion thereof. Lastly, “biological sample” refers to a medium, such as a nutrient broth or gel in which an organism has been propagated, which contains cellular components, such as nucleic acid molecules.

[0072] As used herein, the term “organism” refers to all cellular life-forms, such as prokaryotes and eukaryotes, as well as non-cellular, nucleic acid-containing entities, such as bacteriophage and viruses.

[0073] As used herein, the term “feature” refers to each nucleic acid sequence occupying a discrete physical location on the array; if a given sequence is represented at more than one such site, each site is classified as a feature. In this context, the term “nucleic acid sequence” may refer either to a single nucleic acid molecule, whether double- or single-stranded, to a “clone” of amplified copies of a nucleic acid molecule present at the same physical location on the array, or to a replica, on a separate support, of such a clone.

[0074] As used herein, the term “amplifying” refers to production of copies of a nucleic acid molecule of the array via repeated rounds of primed enzymatic synthesis; “in situ amplification” indicates that such amplifying takes place with the template nucleic acid molecule positioned on a support according to the invention, rather than in solution.

[0075] As used herein, the term “support” refers to a matrix upon which nucleic acid molecules of a nucleic acid array are immobilized; preferably, a support is semi-solid.

[0076] As used herein, the term “semi-solid” refers to a compressible matrix with both a solid and a liquid component, wherein the liquid occupies pores, spaces or other interstices between the solid matrix elements.

[0077] As used herein in reference to the physical placement of nucleic acid molecules or features and/or their orientation relative to one another on an array of the invention, the terms “correspond” or “corresponding” refer to a molecule occupying a position on a second array that is either identical to, or a mirror image of, the position of a molecule from which it was amplified on a first array which served as a template for the production of the second array, or vice versa, such that the arrangement of features of the array relative to one another is conserved between arrays of a plurality.

[0078] As implied by the above statement, a first and second array of a plurality of nucleic acid arrays according to the invention may be of either like or opposite chirality, that is, the patterning of the nucleic acid arrays may be either identical or mirror-imaged.

[0079] As used herein, the term “replica” refers to any nucleic acid array that is produced by a printing process according to the invention using as a template a first randomly-patterned immobilized nucleic acid array.

[0080] As used herein, the term “spot” as applied to a component of a microarray refers to a discrete area of a surface containing a substance deposited by mechanical or other means.

[0081] As used herein, “excluded volume” refers to the volume of space occupied by a particular molecule to the exclusion of other such molecules.

[0082] As used herein, “excess of nucleic acid molecules” refers to an amount of nucleic acid molecules greater than the amount of entities to which such nucleic acid molecules may bind. An excess may comprise as few as one molecule more than the number of binding entities, to twice the number of binding entities, up to 10 times, 100 times, 1000 times the number of binding entities or more.

[0083] As used herein, “signal amplification method” refers to anymethod by which the detection of a nucleic acid is accomplished.

[0084] As used herein, a “nucleic acid capture ligand” or “nucleic acidcapture activity” refers to any substance which binds nucleic acidmolecules, either specifically or non-specifically, or which binds anaffinity tag attached to a nucleic acid molecule in such a way as toimmobilize the nucleic acid molecule to a support bearing the captureligand.

[0085] As used herein, “replica-destructive” refers to methods of signalamplification which render an array or replica of an array non-reusable.

[0086] As used herein, the term “non-reusable,” in reference to an arrayor replica of an array, indicates that, due to the nature of detectionmethods employed, the array cannot be replicated nor used for subsequentdetection methods after the first detection method is performed.

[0087] As used herein, the term “essentially distinct” as applied tofeatures of an array refers to the situation where 90% or more of thefeatures of an array are not in contact with other features on the samearray.

[0088] As used herein, the term “preserved” as applied to the resolutionof nucleic acid features on an array means that the features remainessentially distinct after a given process has been performed.
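
The 90% criterion for “essentially distinct” features lends itself to a simple check. The sketch below is illustrative only: it assumes each feature can be approximated as a circle given by a center and a radius, and that two features are in contact when their circles touch or overlap. The function names and the circular model are assumptions and are not taken from the disclosure.

```python
import math


def fraction_not_in_contact(features):
    """Fraction of features that touch no other feature.

    Each feature is modeled as a circle (x, y, radius); two features are
    treated as being in contact when their circles overlap or touch.
    """
    if not features:
        return 1.0
    isolated = 0
    for i, (xi, yi, ri) in enumerate(features):
        touching = any(
            math.hypot(xi - xj, yi - yj) <= ri + rj
            for j, (xj, yj, rj) in enumerate(features)
            if j != i
        )
        if not touching:
            isolated += 1
    return isolated / len(features)


def is_essentially_distinct(features, threshold=0.90):
    """True when at least `threshold` of the features are not in contact."""
    return fraction_not_in_contact(features) >= threshold
```

Applying the same check to an array before and after a process such as replication would give one way of asking whether resolution has been “preserved” in the sense defined above.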

[0089] As used herein, the term “distinguishable” as applied to a label, refers to a labeling moiety which can be detected when among other labeling moieties.

[0090] As used herein, the term “spectrally distinguishable” or “spectrally resolvable” as applied to a label, refers to a labeling moiety which can be detected by its characteristic fluorescent excitation or emission spectra, one or both of such spectra distinguishing said moiety from other moieties used separately or simultaneously in the particular method.

[0091] As used herein, the term “chain-terminating analog” refers to any nucleotide analog which, once incorporated onto the 3′ end of a nucleic acid molecule, cannot serve as a substrate for further addition of nucleotides to that nucleic acid molecule.

[0092] As used herein, the term “type IIS” refers to a restriction enzyme that cuts at a site remote from its recognition sequence. Such enzymes are known to cut at distances from their recognition sites ranging from 0 to 20 base pairs.

[0093] It is preferred that the support is semi-solid.

[0094] Preferably, the semi-solid support is selected from the group that includes polyacrylamide, cellulose, polyamide (nylon) and cross-linked agarose, -dextran and -polyethylene glycol.

[0095] It is particularly preferred that amplifying of nucleic acid molecules is performed by polymerase chain reaction (PCR).

[0096] Preferably, affixing of nucleic acid molecules to the support is performed using a covalent linker that is selected from the group that includes oxidized 3-methyl uridine, an acrylyl group and hexaethylene glycol. Additionally, Acrydite oligonucleotide primers may be covalently fixed within a polyacrylamide gel.

[0097] It is also contemplated that affixing of nucleic acid molecules to the support is performed via hybridization of the members of the pool to nucleic acid molecules that are covalently bound to the support.

[0098] As used herein, the term “synthetic oligonucleotide” refers to a short (10 to 1,000 nucleotides in length), double- or single-stranded nucleic acid molecule that is chemically synthesized or is the product of a biological system such as a product of primed or unprimed enzymatic synthesis.

[0099] As used herein, the term “template DNA” refers to a plurality of DNA molecules used as the starting material or template for manufacture of a nucleic acid array such as a polyacrylamide-immobilized nucleic acid array.

[0100] As used herein, the term “template nucleic acids” refers to a plurality of nucleic acid molecules used as the starting material or template for manufacture of a nucleic acid array.

[0101] As used herein, the term “amplification primer” refers to an oligonucleotide that may be used as a primer for amplification reactions. The term “PCR primer” refers to an oligonucleotide that may be used as a primer for the polymerase chain reaction. A PCR primer is preferably, but not necessarily, synthetic, and will generally be approximately 10 to 100 nucleotides in length.

[0102] As used herein, the term “Acrydite modified” in reference to an oligonucleotide means that the oligonucleotide has an Acrydite phosphoramidite group attached to the 5′ end of the molecule.

[0103] As used herein, the term “thermostable, template-dependent DNA polymerase” refers to an enzyme capable of conducting primed enzymatic synthesis following incubation at a temperature, greater than 65° C. and less than or equal to approximately 100° C., and for a time, ranging from about 15 seconds to about 5 minutes, that is sufficient to denature essentially all double stranded DNA molecules in a given population.

[0104] As used herein, the term “solid support” refers to a support for a polyacrylamide-immobilized nucleic acid array, such support being essentially non-compressible and lacking pores containing liquid. A solid support is preferably thin and thermally conductive, such that changes in thermal energy characteristic of PCR thermal cycling are conducted through the support to permit amplification of PCR template molecules arrayed on its surface.

[0105] As used herein, the term “binding sites” when used in reference to a nucleic acid molecule, means sequences that hybridize under selected PCR annealing conditions with a selected PCR primer. Binding sites for PCR primers are generally used in pairs situated on either side of a sequence to be amplified, with each member of the pair preferably comprising a sequence from the other member of the pair.

[0106] As used herein, the term “variable sequence” refers to a sequence in a population of nucleic acid molecules that varies between different members of the population. Generally, as used herein, a variable sequence is flanked on either side by sequences that are shared or constant among all members of that population.

BRIEF DESCRIPTION OF THE DRAWINGS

[0107] FIG. 1 shows the results of six cycles of nucleotide addition and detection in polyacrylamide gel matrix fluorescent sequencing reactions on two different template nucleic acid samples. The top panel shows a fluorescent scan of the array after addition of fluorescently labeled dCTP, and the bottom panel shows schematics of sequencing template samples 1 and 2 with expected extension products.

[0108] FIG. 2 shows the result of the addition of fluorescently labeled TTP in the eighth cycle of addition, detection, and cleavage in polyacrylamide gel matrix fluorescent sequencing reactions when the next correct nucleotide was an A. The top panel shows a fluorescent scan, and the bottom panel shows schematics of the expected extension products for sequencing template samples 1 and 2.

[0109] FIG. 3 shows the result of the addition of fluorescently labeled dCTP in the tenth cycle of addition, detection and cleavage in polyacrylamide gel matrix fluorescent sequencing reactions of template samples 1 and 2. The panels are arranged as in FIG. 2.

[0110] FIG. 4 shows the result of the addition of fluorescently labeled TTP in the twelfth cycle of addition, detection and cleavage in polyacrylamide gel matrix fluorescent sequencing reactions of template samples 1 and 2. The panels are arranged as in FIG. 2.

[0111] FIG. 5 is a schematic drawing of a disulfide-bonded cleavable nucleotide fluorophore complex useful in the methods of the invention.

[0112] FIG. 6 shows the results of experiments establishing the function of cleavable linkers in polyacrylamide gel matrix fluorescent sequencing reactions. The top panels show fluorescent scans of primer extension reactions, on two separate sequencing templates, in polyacrylamide spots using nucleotides with non-cleavably (Cy5-dCTP) and cleavably (Cy5-SS-dCTP) linked fluorescent label, before and after cleavage with dithiothreitol (DTT). The bottom panel shows schematics of sequencing templates 1 and 2 with the expected extension products.

[0113] FIG. 7 is a schematic drawing of a nucleic acid template useful in making arrays according to the invention. Two constant regions flank a region of variable sequence.

[0114] FIG. 8 shows the amplification of array features within a gel matrix.

[0115] FIG. 8A shows amplified arrays made using various amounts of starting template nucleic acid.

[0116] FIG. 8B shows the linear relationship between the amount of starting template nucleic acid and the number of amplified array features.

[0117] FIG. 8C shows an agarose gel containing PCR amplification products from a picked and re-amplified array feature.

[0118] FIG. 9 shows the results of experiments examining the relationship of amplified feature size to template length and gel concentration.

[0119] FIG. 9A shows a plot of the radius of array features versus the log of the template length.

[0120] FIG. 9B shows array features created from a 1009 base pair template in a 15% polyacrylamide matrix.

[0121] FIG. 10 shows a replica of a nucleic acid array made in a polyacrylamide gel matrix according to the methods of the invention.

[0122] FIG. 10A shows the original array, and

[0123] FIG. 10B shows a replica of the array of FIG. 10A.

DETAILED DESCRIPTION OF THE INVENTION

[0124] The present invention is directed to the synthesis of nucleic acid array chips, methods by which such chips may be reproduced and methods by which they may be used in diverse applications relating to nucleic acid replication or amplification, genomic characterization, gene expression studies, medical diagnostics and population genetics. The nucleic acid array chips of the replica array have several advantages over the presently available methods.

[0125] Besides any known sequences or combinatorial sequences thereof, a full genome including unknown DNA sequences can be replicated according to the present invention. The size of the nucleic acid fragments or primers to be replicated can be from about 25-mer to about 9000-mer. The present invention is also quick and cost effective. It takes only about one week from discovery of an organism to arrange the full genome sequence of the organism onto chips, at about $10 per chip. In addition, the thickness of the chips is 3000 nm, which provides a much higher sensitivity. The chips are compatible with inexpensive in situ PCR devices, and can be reused as many as 100 times.

[0126] The invention provides for an advance over the arrays of Chetverin and Kramer (WO 93/17126), Chetverin and Chetverina, 1997 (U.S. Pat. No. 5,616,478), and others, in that a method is herein described by which to produce a random nucleic acid array that both is covalently linked to a support (therefore extensively reusable) and permits one to fabricate high-fidelity copies of it without returning to the starting point of the process, thereby eliminating time-consuming, expensive steps and providing for reproducible results both when the copies of the array are made and when they are used. It is evident that this method is not obvious, despite its great utility. No mention of replica plating or printing of amplimers in this context appears to have been made in oligonucleotide array patents or papers. There is no method in the prior art for generating a set of nucleic acid arrays comprising the steps of covalently linking a pool of nucleic acid molecules to a support to form a random array, amplifying the nucleic acid molecules and subsequently replicating the array.

[0127] While reproducibility of manufacture and durability are not of significant concern in the making of arrays in which the nucleic acid molecules are chemically synthesized directly on the support, they are centrally important in cases in which the molecules of the array are of natural origin (for example, a sample of mRNA from an organism). Each nucleic acid sample obtained from a natural source constitutes a unique pool of molecules; these molecules are, themselves, uniquely distributed over the surface of the support, in that the original laying out of the pattern is random. By any prior art method, an array generated from simple, random deposition of a pool of nucleic acid molecules is irreproducible; however, a set of related arrays would be of great utility, since information derived from any one copy from the replicated set would increase the confidence in the identity and/or quality of data generated using the other members of the set.

[0128] The methods provided in the present invention basically consist of five steps: 1) providing a pool of nucleic acid molecules, 2) plating or other transfer of the pool onto a solid support, 3) in situ amplification, 4) replica printing of the amplified nucleic acids and 5) identification of features. Sets of arrays so produced, or members thereof, then may be put to any chip affinity readout use, some of which are summarized below. The production of a set of arrays according to the invention is described in Example 1. The following examples are provided for exemplification purposes only and are not intended to limit the scope of the invention which has been described in broad terms above.

EXAMPLE 1

[0129] Production of a Plurality of a Nucleic Acid Array According to the Invention

[0130] Step 1. Production of a Nucleic Acid Pool with Which to Construct an Array According to the Invention

[0131] A pool or library of n-mers (n=20 to 9000) is made by any of several methods. The pool is either amplified (e.g. by PCR) or left unamplified. A suitable in vitro amplification “vector,” for example, flanking PCR primer sequences or an in vivo plasmid, phage or viral vector from which amplified molecules are excised prior to use, is used. If necessary, random shearing or enzymatic cleavage of large nucleic acid molecules is used to generate the pools; if the nucleic acid molecules are amplified, cleavage is performed either before or after amplification. Alternatively, a nucleic acid sample is random primed, for example with tagged 3′ terminal hexamers, followed by electrophoretic size-selection. The nucleic acid is selected from genomic, synthetic or cDNA sequences (Power, 1996, J. Hosp. Infect., 34: 247-265; Welsh, et al., 1995, Mutation Res., 338: 215-229). The copied or unamplified nucleic acid fragments resulting from any of the above procedures are, if desired, fractionated by size or affinity by a variety of methods including electrophoresis, sedimentation, and chromatography (possibly including elaborate, expensive procedures or limited-quantity resources since the subsequent inexpensive replication methods can justify such investment of effort).

[0132] Pools of nucleic acid molecules are, at this stage, applied directly to the support medium (see Step 2, below). Alternatively, they are cloned into nucleic acid vectors. For example, pools composed of fragments with inherent polarity, such as cDNA molecules, are directionally cloned into nucleic acid vectors that comprise, at the cloning site, oligonucleotide linkers that provide asymmetric flanking sequences to the fragments. Upon their subsequent removal via restriction with enzymes that cleave the vector outside both the cloned fragment and linker sequences, molecules with defined (and different) sequences at their two ends are generated. By denaturing these molecules and spreading them onto a semi-solid support to which are covalently bound oligonucleotides that are complementary to one preferred flanking linker, the orientation of each molecule in the array is determined relative to the surface of the support. Such a polar array is of use for in vitro transcription/translation of the array or any purpose for which directional uniformity is preferred.

[0133] In addition to the attachment of linker sequences to the molecules of the pool for use in directional attachment to the support, a restriction site or regulatory element (such as a promoter element, cap site or translational termination signal), is, if desired, joined with the members of the pool. The use of fragments with termini engineered to comprise useful restriction sites is described below in Example 6.

[0134] Step 2. Transfer of the Nucleic Acid Pool onto a Support Medium

[0135] The nucleic acid pool is diluted (“plated”) out onto a semi-solid medium (such as a polyacrylamide gel) on a solid surface such as a glass slide such that amplifiable molecules are 0.1 to 100 micrometers apart. Sufficient spacing is maintained that features of the array do not contaminate one another during repeated rounds of amplification and replication. It is estimated that a molecule that is immobilized at one end can, at most, diffuse the distance of a single molecule length during each round of replication. Accordingly, arrays of shorter molecules are plated at higher density than those comprising long molecules.
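
The spacing range given above can be translated into an approximate feature density. The following is a rough sketch, assuming an idealized square grid with one amplifiable molecule per grid point; random plating does not produce a regular grid, so the result is only an order-of-magnitude estimate. The function name is illustrative.

```python
def features_per_cm2(spacing_um):
    """Features per square centimeter on an idealized square grid.

    Assumption: molecules sit on a regular grid with the given spacing in
    micrometers. Random plating only approximates this on average.
    """
    if spacing_um <= 0:
        raise ValueError("spacing must be positive")
    um_per_cm = 1e4
    per_side = um_per_cm / spacing_um
    return per_side ** 2


for spacing in (0.1, 1.0, 10.0, 100.0):
    print(f"{spacing:>6} um spacing: about {features_per_cm2(spacing):.0e} features per cm^2")
```

Under this idealization, spacings of 100 micrometers and 0.1 micrometer correspond to roughly 10^4 and 10^10 features per square centimeter, respectively.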

[0136] Immobilizing media that are of use according to the invention are physically stable and chemically inert under the conditions required for nucleic acid molecule deposition, amplification and the subsequent replication of the array. A useful support matrix withstands the rapid changes in, and extremes of, temperature required for PCR and retains structural integrity under stress during the replica printing process. The support material permits enzymatic nucleic acid synthesis; if it is unknown whether a given substance will do so, it is tested empirically prior to any attempt at production of a set of arrays according to the invention. The support structure comprises a semi-solid (i.e. gelatinous) lattice or matrix, wherein the interstices or pores between lattice or matrix elements are filled with an aqueous or other liquid medium; typical pore (or ‘sieve’) sizes are in the range of 100 μm to 5 nm. Larger spaces between matrix elements are within tolerance limits, but the potential for diffusion of amplified products prior to their immobilization is increased. The semi-solid support is compressible, so that full surface-to-surface contact, essentially sufficient to form a seal between two supports, although that is not the object, may be achieved during replica printing. The support is prepared such that it is planar, or effectively so, for the purposes of printing; for example, an effectively planar support might be cylindrical, such that the nucleic acids of the array are distributed over its outer surface in order to contact other supports, which are either planar or cylindrical, by rolling one over the other. Lastly, a support material of use according to the invention permits immobilizing (covalent linking) of nucleic acid features of an array to it by means enumerated below. Materials that satisfy these requirements comprise both organic and inorganic substances, and include, but are not limited to, polyacrylamide, cellulose and polyamide (nylon), as well as cross-linked agarose, dextran or polyethylene glycol.

[0137] Of the support media upon which the members of the pool of nucleic acid molecules may be anchored, one that is particularly preferred is a thin, polyacrylamide gel on a glass support, such as a plate, slide or chip. A polyacrylamide sheet of this type is synthesized as follows: Acrylamide and bis-acrylamide are mixed in a ratio that is designed to yield the degree of crosslinking between individual polymer strands (for example, a ratio of 38:2 is typical of sequencing gels) that results in the desired pore size when the overall percentage of the mixture used in the gel is adjusted to give the polyacrylamide sheet its required tensile properties. Polyacrylamide gel casting methods are well known in the art (see Sambrook et al., 1989, Molecular Cloning. A Laboratory Manual., 2nd Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.), and one of skill has no difficulty in making such adjustments.

[0138] The gel sheet is cast between two rigid surfaces, at least one of which is the glass to which it will remain attached after removal of the other. The casting surface that is to be removed after polymerization is complete is coated with a lubricant that will not inhibit gel polymerization; for this purpose, silane is commonly employed. A layer of silane is spread upon the surface under a fume hood and allowed to stand until nearly dry. Excess silane is then removed (wiped or, in the case of small objects, rinsed extensively) with ethanol. The glass surface which will remain in association with the gel sheet is treated with γ-methacryloxypropyltrimethoxysilane (Cat. No. M6514, Sigma; St. Louis, Mo.), often referred to as ‘crosslink silane’, prior to casting. The glass surface that will contact the gel is triply-coated with this agent. Each treatment of an area equal to 1200 cm² requires 125 μl of crosslink silane in 25 ml of ethanol. Immediately before this solution is spread over the glass surface, it is combined with a mixture of 750 μl water and 75 μl glacial acetic acid and shaken vigorously. The ethanol solvent is allowed to evaporate between coatings (about 5 minutes under a fume hood) and, after the last coat has dried, excess crosslink silane is removed as completely as possible via extensive ethanol washes in order to prevent ‘sandwiching’ of the other support plate onto the gel. The plates are then assembled and the gel cast as desired.
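
For glass surfaces of other sizes, the crosslink silane quantities given above can be scaled. The sketch below assumes that the quantities scale linearly with area, which the text does not state; it gives only the 1200 cm² reference point and the three coats. The function and dictionary names are illustrative.

```python
# Per-treatment quantities quoted in the text for an area of 1200 cm^2.
REFERENCE_AREA_CM2 = 1200.0
REFERENCE_RECIPE = {
    "crosslink_silane_ul": 125.0,
    "ethanol_ml": 25.0,
    "water_ul": 750.0,
    "glacial_acetic_acid_ul": 75.0,
}


def scale_silane_recipe(area_cm2, coats=3):
    """Scale the crosslink silane recipe with glass area.

    Assumption: linear scaling with area. Returns the quantities for a
    single coat and the totals over all coats.
    """
    if area_cm2 <= 0 or coats < 1:
        raise ValueError("area must be positive and coats at least 1")
    factor = area_cm2 / REFERENCE_AREA_CM2
    per_coat = {name: amount * factor for name, amount in REFERENCE_RECIPE.items()}
    total = {name: amount * coats for name, amount in per_coat.items()}
    return per_coat, total


per_coat, total = scale_silane_recipe(area_cm2=600.0)
```

Because the silane solution is combined with the water and acetic acid mixture immediately before spreading, the per-coat quantities are the ones to mix at each step; the totals are useful only for planning reagent consumption.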

[0139] The only operative constraint that determines the size of a gel that is of use according to the invention is the physical ability of one of skill in the art to cast such a gel. The casting of gels of up to one meter in length is, while cumbersome, a procedure well known to workers skilled in nucleic acid sequencing technology. A larger gel, if produced, is also of use according to the invention. An extremely small gel is cut from a larger whole after polymerization is complete.

[0140] Note that at least one procedure for casting a polyacrylamide gel with bioactive substances, such as enzymes, entrapped within its matrix is known in the art (O'Driscoll, 1976, Methods Enzymol., 44: 169-183); a similar protocol, using photo-crosslinkable polyethylene glycol resins, that permits entrapment of living cells in a gel matrix has also been documented (Nojima and Yamada, 1987, Methods Enzymol., 136: 380-394). Such methods are of use according to the invention. As mentioned below, whole cells are typically cast into agarose for the purpose of delivering intact chromosomal DNA into a matrix suitable for pulsed-field gel electrophoresis or to serve as a “lawn” of host cells that will support bacteriophage growth prior to the lifting of plaques according to the method of Benton and Davis (see Maniatis et al., 1982, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.). In short, electrophoresis-grade agarose (e.g. Ultrapure; Life Technologies/Gibco-BRL) is dissolved in a physiological (isotonic) buffer and allowed to equilibrate to a temperature of 50° to 52° C. in a tube, bottle or flask. Cells are then added to the agarose and mixed thoroughly, but rapidly (if in a bottle or tube, by capping and inversion, if in a flask, by swirling), before the mixture is decanted or pipetted into a gel tray. If low-melting point agarose is used, it may be brought to a much lower temperature (down to approximately room temperature, depending upon the concentration of the agarose) prior to the addition of cells. This is desirable for some cell types; however, if electrophoresis is to follow cell lysis prior to covalent attachment of the molecules of the resultant nucleic acid pool to the support, it is performed under refrigeration, such as in a 4° to 10° C. ‘cold’ room.

[0141] Immobilization of nucleic acid molecules to the support matrix according to the invention is accomplished by any of several procedures. Direct immobilizing, as through use of 3′-terminal tags bearing chemical groups suitable for covalent linkage to the support, hybridization of single-stranded molecules of the pool of nucleic acid molecules to oligonucleotide primers already bound to the support or the spreading of the nucleic acid molecules on the support accompanied by the introduction of primers, added either before or after plating, that may be covalently linked to the support, may be performed. Where pre-immobilized primers are used, they are designed to capture a broad spectrum of sequence motifs (for example, all possible multimers of a given chain length, e.g. hexamers), nucleic acids with homology to a specific sequence or nucleic acids containing variations on a particular sequence motif. Alternatively, the primers encompass a synthetic molecular feature common to all members of the pool of nucleic acid molecules, such as a linker sequence (see above).

[0142] Oligonucleotide primers useful according to the invention are single-stranded DNA or RNA molecules that are hybridizable to a nucleic acid template to prime enzymatic synthesis of a second nucleic acid strand. The primer is complementary to a portion of a target molecule present in a pool of nucleic acid molecules used in the preparation of sets of arrays of the invention.

[0143] It is contemplated that such a molecule is prepared by synthetic methods, either chemical or enzymatic. Alternatively, such a molecule or a fragment thereof is naturally occurring, and is isolated from its natural source or purchased from a commercial supplier. Oligonucleotide primers are 6 to 100, and even up to 1,000, nucleotides in length, but ideally from 10 to 30 nucleotides, although oligonucleotides of different length are of use.

[0144] Typically, selective hybridization occurs when two nucleic acid sequences are substantially complementary (at least about 65% complementary over a stretch of at least 14 to 25 nucleotides, preferably at least about 75%, more preferably at least about 90% complementary). See Kanehisa, M., 1984, Nucleic Acids Res. 12: 203, incorporated herein by reference. As a result, it is expected that a certain degree of mismatch at the priming site is tolerated. Such mismatch may be small, such as a mono-, di- or tri-nucleotide. Alternatively, it may encompass loops, which we define as regions in which mismatch encompasses an uninterrupted series of four or more nucleotides.
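
The complementarity thresholds above can be evaluated for a candidate primer and priming site. The following is a minimal sketch, assuming DNA sequences of equal length written 5′ to 3′ and compared position by position in antiparallel orientation; loops and gaps are not modeled, and the function names and example sequences are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def percent_complementary(primer, site):
    """Percent of positions at which `primer` base-pairs with `site`.

    Both sequences are written 5'->3' and must have the same length; the
    primer is read against the site in antiparallel orientation.
    """
    primer, site = primer.upper(), site.upper()
    if not primer or len(primer) != len(site):
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum(
        1 for p, s in zip(primer, reversed(site)) if COMPLEMENT.get(p) == s
    )
    return 100.0 * paired / len(primer)


def may_hybridize_selectively(primer, site, min_percent=65.0, min_length=14):
    """Apply the 'at least about 65% over at least 14 nucleotides' criterion."""
    return (
        len(primer) >= min_length
        and percent_complementary(primer, site) >= min_percent
    )


site = "ATGCATGCATGCATGC"                   # hypothetical priming site
perfect_primer = "GCATGCATGCATGCAT"         # reverse complement of the site
one_mismatch_primer = "ACATGCATGCATGCAT"    # single mismatch at the 5' end
```

The default thresholds correspond to the least stringent values quoted above; passing `min_percent=75.0` or `min_percent=90.0` applies the preferred and more preferred levels.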

[0145] Overall, five factors influence the efficiency and selectivity of hybridization of the primer to a second nucleic acid molecule. These factors, which are (i) primer length, (ii) the nucleotide sequence and/or composition, (iii) hybridization temperature, (iv) buffer chemistry and (v) the potential for steric hindrance in the region to which the primer is required to hybridize, are important considerations when non-random priming sequences are designed.

[0146] There is a positive correlation between primer length and both the efficiency and accuracy with which a primer will anneal to a target sequence; longer sequences have a higher T_(M) than do shorter ones, and are less likely to be repeated within a given target sequence, thereby cutting down on promiscuous hybridization. Primer sequences with a high G-C content or that comprise palindromic sequences tend to self-hybridize, as do their intended target sites, since unimolecular, rather than bimolecular, hybridization kinetics are generally favored in solution; at the same time, it is important to design a primer containing sufficient numbers of G-C nucleotide pairings to bind the target sequence tightly, since each such pair is bound by three hydrogen bonds, rather than the two that are found when A and T bases pair. Hybridization temperature varies inversely with primer annealing efficiency, as does the concentration of organic solvents, e.g. formamide, that might be included in a hybridization mixture, while increases in salt concentration facilitate binding. Under stringent hybridization conditions, longer probes hybridize more efficiently than do shorter ones, which are sufficient under more permissive conditions. Stringent hybridization conditions typically include salt concentrations of less than about 1M, more usually less than about 500 mM and preferably less than about 200 mM. Hybridization temperatures range from as low as 0° C. to greater than 22° C., greater than about 30° C., and (most often) in excess of about 37° C. Longer fragments may require higher hybridization temperatures for specific hybridization. As several factors affect the stringency of hybridization, the combination of parameters is more important than the absolute measure of any one alone.
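
Several of the sequence considerations above can be screened with simple calculations. The sketch below is illustrative: it computes G-C content, flags sequences that equal their own reverse complement, and estimates T_(M) with the Wallace rule of thumb (2° C. per A or T, 4° C. per G or C). The Wallace rule is not taken from the present text and is only a rough guide for short oligonucleotides; the function names are assumptions.

```python
_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def gc_fraction(seq):
    """Fraction of bases in a DNA sequence that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must be non-empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm_celsius(seq):
    """Rough melting temperature: 2 C per A/T plus 4 C per G/C.

    Assumption: a rule of thumb for short oligonucleotides; it ignores
    salt, solvent and concentration effects discussed in the text.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def is_palindromic(seq):
    """True when a DNA sequence equals its own reverse complement."""
    seq = seq.upper()
    return seq == "".join(_COMPLEMENT[base] for base in reversed(seq))
```

Consistent with the discussion above, the estimate rises with both length and G-C content, and a palindromic candidate would be expected to self-hybridize. Dedicated primer design programs take many more parameters into account.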

[0147] Primers are designed with the above first four considerations in mind. While estimates of the relative merits of numerous sequences are made mentally, computer programs have been designed to assist in the evaluation of these several parameters and the optimization of primer sequences. Examples of such programs are “PrimerSelect” of the DNAStar™ software package (DNAStar, Inc.; Madison, Wis.) and OLIGO 4.0 (National Biosciences, Inc.). Once designed, suitable oligonucleotides are prepared by a suitable method, e.g. the phosphoramidite method described by Beaucage and Caruthers (1981, Tetrahedron Lett., 22: 1859-1862) or the triester method according to Matteucci et al. (1981, J. Am. Chem. Soc., 103: 3185), both incorporated herein by reference, or by other chemical methods using either a commercial automated oligonucleotide synthesizer or VLSIPS™ technology.

[0148] Two means of crosslinking a nucleic acid molecule to a preferred support of the invention, a polyacrylamide gel sheet, will be discussed in some detail. The first (provided by Khrapko et al., 1996, U.S. Pat. No. 5,552,270) involves the 3′ capping of nucleic acid molecules with 3-methyl uridine; using this method, the nucleic acid molecules of the libraries of the present invention are prepared so as to include this modified base at their 3′ ends. In the cited protocol, an 8% polyacrylamide gel (30:1, acrylamide:bis-acrylamide) sheet 30 μm in thickness is cast and then exposed to 50% hydrazine at room temperature for 1 hour; such a gel is also of use according to the present invention. The matrix is then air dried to the extent that it will absorb a solution containing nucleic acid molecules, as described below. Nucleic acid molecules containing 3-methyl uridine at their 3′ ends are oxidized with 1 mM sodium periodate (NaIO₄) for 10 minutes to 1 hour at room temperature, precipitated with 8 to 10 volumes of 2% LiClO₄ in acetone and dissolved in water at a concentration of 10 pmol/μl. This concentration is adjusted so that when the nucleic acid molecules are spread upon the support in a volume that covers its surface evenly, yet is efficiently (i.e. completely) absorbed by it, the density of nucleic acid molecules of the array falls within the range discussed above. The nucleic acid molecules are spread over the gel surface and the plates are placed in a humidified chamber for 4 hours. They are then dried for 0.5 hour at room temperature and washed in a buffer that is appropriate to their subsequent use. Alternatively, the gels are rinsed in water, re-dried and stored at −20° C. until needed. It is said that the overall yield of nucleic acid that is bound to the gel is 80% and that of these molecules, 98% are specifically linked through their oxidized 3′ groups.
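
The adjustment of concentration described above can be estimated numerically. The following is a rough sketch under stated assumptions: an idealized square grid at the target spacing, every bound molecule being amplifiable, and a bound fraction of 0.8 (the overall yield quoted in the cited protocol). The function names, the applied volume and the example values are hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molecules_per_ul(conc_pmol_per_ul):
    """Number of molecules in one microliter at the given concentration."""
    return conc_pmol_per_ul * 1e-12 * AVOGADRO


def dilution_for_spacing(stock_pmol_per_ul, volume_ul, area_cm2, spacing_um,
                         bound_fraction=0.8):
    """Fold-dilution of the stock that would give the target mean spacing.

    Assumptions: idealized square grid, all bound molecules amplifiable,
    and `bound_fraction` of the applied molecules binding to the gel.
    """
    if min(stock_pmol_per_ul, volume_ul, area_cm2, spacing_um) <= 0:
        raise ValueError("all quantities must be positive")
    if not 0 < bound_fraction <= 1:
        raise ValueError("bound_fraction must be in (0, 1]")
    target_bound = area_cm2 * (1e4 / spacing_um) ** 2  # molecules on the gel
    needed_applied = target_bound / bound_fraction     # molecules to apply
    available = molecules_per_ul(stock_pmol_per_ul) * volume_ul
    return available / needed_applied


# Hypothetical example: 10 ul of a 10 pmol/ul stock spread over 1 cm^2,
# aiming for a mean spacing of 10 micrometers.
fold = dilution_for_spacing(10.0, 10.0, 1.0, 10.0)
```

A 10 pmol/μl solution contains on the order of 10^12 molecules per microliter, so under these assumptions a very large dilution would be needed to reach spacings in the micrometer range.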

[0149] A second crosslinking moiety that is of use in attaching nucleic acid molecules covalently to a polyacrylamide sheet is a 5′ acrylyl group, which is attached to the primers used in Example 6. Oligonucleotide primers bearing such a modified base at their 5′ ends may be used according to the invention. In particular, such oligonucleotides are cast directly into the gel, such that the acrylyl group becomes an integral, covalently-bonded part of the polymerizing matrix. The 3′ end of the primer remains unbound, so that it is free to interact with, and hybridize to, a nucleic acid molecule of the pool and prime its enzymatic second-strand synthesis.

[0150] Alternatively, hexaethylene glycol is used to covalently link nucleic acid molecules to nylon or other support matrices (Adams and Kron, 1994, U.S. Pat. No. 5,641,658). In addition, nucleic acid molecules are crosslinked to nylon via irradiation with ultraviolet light. While the length of time for which a support is irradiated as well as the optimal distance from the ultraviolet source is calibrated with each instrument used, due to variations in wavelength and transmission strength, at least one irradiation device designed specifically for crosslinking of nucleic acid molecules to hybridization membranes is commercially available (Stratalinker; Stratagene). It should be noted that in the process of crosslinking via irradiation, limited nicking of nucleic acid strands occurs; however, the amount of nicking is generally negligible under conditions such as those used in hybridization procedures. Attachment of nucleic acid molecules to the support at positions that are neither 5′- nor 3′-terminal also occurs, but it should be noted that the potential for utility of an array so crosslinked is largely uncompromised, as such crosslinking does not inhibit hybridization of oligonucleotide primers to the immobilized molecule where it is bonded to the support. The production of ‘terminal’ copies of an array of the invention, i.e. those that will not serve as templates for further replication, is not affected by the method of crosslinking; however, in situations in which sites of covalent linkage are, preferably, at the termini of molecules of the array, crosslinking methods other than ultraviolet irradiation are employed.

[0151] Step 3. Amplification of the Nucleic Acid Molecules of the Array

[0152] The molecules are amplified in situ (Tsongalis et al., 1994, Clinical Chemistry, 40: 381-384; see also review by Long and Komminoth, 1997, Methods Mol. Biol., 71: 141-161) by standard molecular techniques, such as thermal-cycled PCR (Mullis and Faloona, 1987, Methods Enzymol., 155: 335-350) or isothermal 3SR (Gingeras et al., 1990, Annales de Biologie Clinique, 48(7): 498-501; Guatelli et al., 1990, Proc. Natl. Acad. Sci. U.S.A., 87: 1874). Another method of nucleic acid amplification that is of use according to the invention is the DNA ligase amplification reaction (LAR), which has been described as permitting the exponential increase of specific short sequences through the activities of any one of several bacterial DNA ligases (Wu and Wallace, 1989, Genomics, 4: 560). The contents of this article are herein incorporated by reference.

[0153] The polymerase chain reaction (PCR), which uses multiple cycles of DNA replication catalyzed by a thermostable, DNA-dependent DNA polymerase to amplify the target sequence of interest, is well known in the art, and is presented in detail in the Examples below. The second amplification process, 3SR, is an outgrowth of the transcription-based amplification system (TAS), which capitalizes on the high promoter sequence specificity and reiterative properties of bacteriophage DNA-dependent RNA polymerases to decrease the number of amplification cycles necessary to achieve high amplification levels (Kwoh et al., 1989, Proc. Natl. Acad. Sci. U.S.A., 83: 1173-1177). The 3SR method, which comprises an isothermal Self-Sustained Sequence Replication amplification reaction, is as follows:

[0154] Each priming oligonucleotide contains the T7 RNA polymerase binding sequence (TAATACGACTCACTATA [SEQ ID NO: 1]) and the preferred transcriptional initiation site. The remaining sequence of each primer is complementary to the target sequence on the molecule to be amplified.

[0155] The 3SR amplification reaction is carried out in 100 μl and contains the target RNA, 40 mM Tris·HCl, pH 8.1, 20 mM MgCl₂, 2 mM spermidine·HCl, 5 mM dithiothreitol, 80 μg/ml BSA, 1 mM dATP, 1 mM dGTP, 1 mM dTTP, 4 mM ATP, 4 mM CTP, 1 mM GTP, 4 mM dTTP, 4 mM ATP, 4 mM CTP, 4 mM GTP, 4 mM UTP, and a suitable amount of oligonucleotide primer (250 ng of a 57-mer; this amount is scaled up or down, proportionally, depending upon the length of the primer sequence). Three to 6 attomoles of the nucleic acid target for the 3SR reactions are used. As a control for background, a 3SR reaction without any target (H₂O) is run. The reaction mixture is heated to 100° C. for 1 minute, and then rapidly chilled to 42° C. After 1 minute, 10 units (usually in a volume of approximately 2 μl) of reverse transcriptase (e.g. avian myeloblastosis virus reverse transcriptase, AMV-RT; Life Technologies/Gibco-BRL) are added. The reaction is incubated for 10 minutes at 42° C. and then heated to 100° C. for 1 minute. (If a 3SR reaction is performed using a single-stranded template, the reaction mixture is heated instead to 65° C. for 1 minute.) Reactions are then cooled to 37° C. for 2 minutes prior to the addition of 4.6 μl of a 3SR enzyme mix, which contains 1.6 μl of AMV-RT at 18.5 units/μl, 1.0 μl T7 RNA polymerase (both e.g. from Stratagene; La Jolla, Calif.) at 100 units/μl and 2.0 μl E. coli RNase H at 4 units/μl (e.g. from Gibco/Life Technologies; Gaithersburg, Md.). It is well within the knowledge of one of skill in the art to adjust enzyme volumes as needed to account for variations in the specific activities of enzymes drawn from different production lots or supplied by different manufacturers. The reaction is incubated at 37° C. for 1 hour and stopped by freezing. While the handling of reagents varies depending on the physical size of the array (which planar surface, if large, requires containment such as a tray or thermal-resistant hybridization bag rather than a tube), this method is of use to amplify the molecules of an array according to the invention.
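The adjustment of enzyme volumes for lots of differing activity can be illustrated with a minimal sketch. The volumes and stock activities are those recited for the 3SR enzyme mix; the function names and the replacement lot activity of 25 units/μl are hypothetical and serve only as an example.

```python
# Enzyme mix as recited in the protocol: (volume in µl, stock activity in units/µl).
ENZYME_MIX = {
    "AMV-RT": (1.6, 18.5),
    "T7 RNA polymerase": (1.0, 100.0),
    "E. coli RNase H": (2.0, 4.0),
}


def units_delivered(mix):
    """Return the units of each enzyme added to one 100 µl reaction."""
    return {name: volume * activity for name, (volume, activity) in mix.items()}


def adjusted_volume(target_units, lot_activity):
    """Volume (µl) of a different lot that delivers the same number of units."""
    if lot_activity <= 0:
        raise ValueError("lot activity must be positive")
    return target_units / lot_activity


units = units_delivered(ENZYME_MIX)
total_volume = sum(volume for volume, _ in ENZYME_MIX.values())

# Hypothetical example: a new AMV-RT lot supplied at 25 units/µl.
new_amv_rt_volume = adjusted_volume(units["AMV-RT"], 25.0)

if __name__ == "__main__":
    for name, value in units.items():
        print(f"{name}: {value:.1f} units")
    print(f"total enzyme mix volume: {total_volume:.1f} µl")
    print(f"AMV-RT volume at 25 units/µl: {new_amv_rt_volume:.3f} µl")
```

The sketch holds the number of units constant and lets the volume vary; any change in total mix volume would be made up with buffer or water at the discretion of the practitioner.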

[0156] Other methods which are of use in the amplification of molecules of the array include, but are not limited to, nucleic acid sequence-based amplification (NASBA; Compton, 1991, Nature, 350: 91-92, incorporated herein by reference) and strand-displacement amplification (SDA; Walker et al., 1992, Nucleic Acids Res., 20: 1691-1696, incorporated herein by reference).

[0157] Step 4. Replication of the Array

[0158] a. The master plate generated in steps 1 through 3 is replica-plated by any of a number of methods (reviewed by Lederberg, 1989, Genetics, 121(3): 395-9) onto similar gel-chips. This replication is performed by directly contacting the compressible surfaces of the two gels face to face with sufficient pressure that a few molecules of each clone are transferred from the master to the replica. Such contact is brief, on the order of 1 second to 2 minutes. This is done for additional replicas from the same master, limited only by the number of molecules post-amplification available for transfer divided by the minimum number of molecules that must be transferred to achieve an acceptably faithful copy. While it is theoretically possible to transfer as few as a single molecule per feature, a more conservative approach is taken. The number of each species of molecule available for transfer never approaches a value so low as to raise concern about the probability of feature loss or to the point at which a base substitution during replication of one member of a feature could, in subsequent rounds of amplification, create a significant (detectable) population of mutated molecules that might be mistaken for the unaltered sequence, unless errors of those types are within the limits of tolerance for the application for which the array is intended. Note that differential replicative efficiencies of the molecules of the array are not as great a concern as they would be in the case of amplification of a conventional library, such as a phage library, in solution or on a non-covalently-bound array. Because of the physical limitations on diffusion of molecules of any feature, one which is efficiently amplified cannot ‘overgrow’ one which is copied less efficiently, although the density of complete molecules of the latter on the array may be low. It is estimated that 10 to 100 molecules per feature are sufficient to achieve fidelity during the printing process. Typically, at least 100 to 1000 molecules are transferred.
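The limit on the number of replicas obtainable from one master, and the risk of feature loss at low transfer numbers, can be approximated as in the following sketch. The post-amplification copy number of 1,000,000 molecules per feature is a hypothetical value not given in the text, and the Poisson model of transfer (probability of transferring no molecule equal to e^(−mean)) is an assumption introduced here for illustration only.

```python
import math


def max_replicas(molecules_available, molecules_per_transfer):
    """Number of replicas one feature can supply before it is exhausted."""
    if molecules_per_transfer <= 0:
        raise ValueError("molecules_per_transfer must be positive")
    return molecules_available // molecules_per_transfer


def feature_loss_probability(mean_transferred):
    """Chance that no molecule of a feature is transferred.

    Assumes the number transferred is Poisson-distributed about the mean;
    this model is an illustrative assumption, not part of the protocol.
    """
    return math.exp(-mean_transferred)


if __name__ == "__main__":
    available = 1_000_000  # hypothetical copies per feature after amplification
    for per_transfer in (10, 100, 1000):
        print(
            per_transfer,
            max_replicas(available, per_transfer),
            f"{feature_loss_probability(per_transfer):.3g}",
        )
```

Under these assumptions, a mean transfer of 10 molecules per feature already makes complete loss of a feature rare, which is in keeping with the estimate of 10 to 100 molecules given above.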

[0159] Alternatively, the plated DNA is reproduced inexpensively by microcontact printing, or μCP (Jackman et al., 1995, Science, 269(5224): 664-666), onto a surface with an initially uniform (or patterned) coating of two oligonucleotides (one or both immobilized by their 5′ ends) suitable for in situ amplification. Pattern elements are transferred from an elastomeric support (comparable in its physical properties to support materials that are useful according to the invention) to a rigid, curved object that is rolled over it; if desired, a further, secondary transfer of the pattern elements from the rigid cylinder or other object onto a support is performed. The surface of one or both is compliant to achieve uniform contact. For example, 30 micron thin polyacrylamide films are used for immobilizing oligomers covalently as well as for in situ hybridizations (Khrapko et al., 1991, DNA Sequence, 1(6): 375-88). Effective contact printing is achieved with the transfer of very few molecules of double- or single-stranded DNA from each sub-feature to the corresponding point on the recipient support.

[0160] b. The replicas are then amplified as in step 3.

[0161] c. Alternatively, a replica serves as a master for subsequent steps like step 4, limited by the diffusion of the features and the desired feature resolution.

[0162] Step 5. Identification of Features of the Array

[0163] Ideally, feature identification is performed on the first array of a set produced by the methods described above; however, it is also done using any array of a set, regardless of its position in the line of production. The features are sequenced by hybridization to fluorescently labeled oligomers representing all sequences of a certain length (e.g. all 4096 hexamers) as described for Sequencing-by-Hybridization (SBH, also called Sequencing-by-Hybridization-to-an-Oligonucleotide-Matrix, or SHOM; Drmanac et al., 1993, Science, 260(5114): 1649-52; Khrapko et al., 1991, supra; Mugasimangalam et al., 1997, Nucleic Acids Res., 25: 800-805). The sequencing in step 5 is considerably easier than conventional SBH if the feature lengths are short (e.g. ss-25-mers rather than the ds-300-mers or longer used in SBH), if the genome sequence is known or if a preselection of features is used.

[0164] SBH involves a strategy of overlapping block reading. It is based on hybridization of DNA with the complete set of immobilized oligonucleotides of a certain length fixed in specific positions on a support. The efficiency of SBH depends on the ability to sort out effectively perfect duplexes from those that are imperfect (i.e. contain base pair mismatches). This is achieved by comparing the temperature-dependent dissociation curves of the duplexes formed by DNA and each of the immobilized oligonucleotides with standard dissociation curves for perfect oligonucleotide duplexes.
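The complete oligomer set and the overlapping blocks read by SBH can be enumerated as in the following sketch. The function names are hypothetical, the short target sequence is an arbitrary example, and the model treats hybridization as exact string matching on a single strand, ignoring mismatched duplexes and strand complementarity.

```python
from itertools import product


def all_oligomers(k):
    """Every DNA sequence of length k, in lexicographic order (4**k entries)."""
    return ["".join(bases) for bases in product("ACGT", repeat=k)]


def spectrum(target, k):
    """Set of overlapping k-mers contained in the target (perfect matches only)."""
    return {target[i:i + k] for i in range(len(target) - k + 1)}


if __name__ == "__main__":
    hexamers = all_oligomers(6)
    print(len(hexamers))  # size of the complete hexamer set

    target = "ACGTACGT"  # arbitrary illustrative sequence
    address = {oligo: index for index, oligo in enumerate(hexamers)}
    for block in sorted(spectrum(target, 6)):
        print(block, address[block])
```

Each overlapping block shares all but one base with its neighbor, which is what allows the blocks to be reassembled into a contiguous sequence.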

[0165] To generate a hybridization and dissociation curve, a ³²P-labeled DNA fragment (30,000 cpm, 30 fmoles) in 1 μl of hybridization buffer (1 M NaCl; 10 mM Na phosphate, pH 7.0; 0.5 mM EDTA) is pipetted onto a dry plate so as to cover a dot of an immobilized oligonucleotide. Hybridization is performed for 30 minutes at 0° C. The support is rinsed with 20 ml of hybridization buffer at 0° C. and then washed 10 times with the same buffer, each wash being performed for 1 minute at a temperature 5° C. higher than the previous one. The remaining radioactivity is measured after each wash with a minimonitor (e.g. a Mini monitor 125; Victoreen) additionally equipped with a count integrator, through a 5 mm aperture in a lead screen. The remaining radioactivity (% of input) is plotted on a logarithmic scale against wash temperature.
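The wash series and the quantity plotted in the dissociation curve can be tabulated as in the following sketch. It assumes that the first wash is performed at 5° C. (the protocol fixes only the 5° C. increment over the preceding step), and the counts supplied in the example are hypothetical values, not measured data.

```python
import math


def wash_temperatures(start_c=5, step_c=5, n_washes=10):
    """Temperature of each successive 1-minute wash, in degrees C."""
    return [start_c + i * step_c for i in range(n_washes)]


def percent_remaining(counts_cpm, input_cpm):
    """Remaining radioactivity after each wash as a percentage of input."""
    if input_cpm <= 0:
        raise ValueError("input_cpm must be positive")
    return [100.0 * count / input_cpm for count in counts_cpm]


def log_scale(percentages):
    """log10 of each percentage, i.e. the value plotted against temperature."""
    return [math.log10(p) for p in percentages]


if __name__ == "__main__":
    temps = wash_temperatures()
    # Hypothetical counts after each of the 10 washes (30,000 cpm input).
    counts = [27000, 26000, 24000, 20000, 14000, 7000, 2500, 800, 300, 150]
    for t, p in zip(temps, percent_remaining(counts, 30000)):
        print(f"{t:>3} C  {p:6.2f} %  log10 = {math.log10(p):.2f}")
```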

[0166] For hybridization with a fluorescently-labeled probe, a volume of hybridization solution sufficient to cover the array is used, containing the probe fragment at a concentration of 2 fmoles/0.01 μl. The hybridization is incubated for 5.0 hours at 17° C. and the array is then washed at 0° C., also in hybridization buffer. Hybridized signal is observed and photographed with a fluorescence microscope (e.g. Leitz “Aristoplan”; input filter 510-560 nm, output filter 580 nm) equipped with a photocamera. Using 250 ASA film, an exposure of approximately 3 minutes is taken.

[0167] For SBH, one suitable immobilization support is a 30 μm-thick polyacrylamide gel covalently attached to glass. Oligonucleotides to be used as probes in this procedure are chemically synthesized (e.g. by the solid-support phosphoramidite method, deprotected in ammonium hydroxide for 12 h at 55° C. and purified by PAGE under denaturing conditions). Prior to use, primers are labeled either at the 5′-end with [γ-³²P]ATP, using T4 polynucleotide kinase, to a specific activity of about 1000 cpm/fmol, or at the 3′-end with a fluorescent label, e.g. tetramethylrhodamine (TMR), coupled to dUTP through the base by terminal transferase (Aleksandrova et al., 1990, Molek. Biologia [Moscow], 24: 1100-1108) and further purified by PAGE.

[0168] An alternative method of sequencing involves subsequent rounds of stepwise ligation and cleavage of a labeled probe to a target polynucleotide whose sequence is to be determined (Brenner, U.S. Pat. No. 5,599,675). According to this method, the nucleic acid to be sequenced is prepared as a double-stranded DNA molecule with a “sticky end,” in other words, a single-stranded terminal overhang, which overhang is of a known length that is uniform among the molecules of the preparation, typically 4 to 6 bases. These molecules are then probed in order to determine the identity of a particular base present in the single-stranded region, typically the terminal base. A probe of use in this method is a double-stranded polynucleotide which (i) contains a recognition site for a nuclease, and (ii) typically has a protruding strand capable of forming a duplex with a complementary protruding strand of the target polynucleotide. In each sequencing cycle, only those probes whose protruding strands form perfectly-matched duplexes with the protruding strand of the target polynucleotide hybridize and are then ligated to the end of the target polynucleotide. The probe molecules are divided into four populations, wherein each such population comprises one of the four possible nucleotides at the position to be determined, each labeled with a distinct fluorescent dye. The remaining positions of the duplex-forming region are occupied with randomized, unlabeled bases, so that every possible multimer the length of that region is represented; therefore, a certain percentage of probe molecules in each pool are complementary to the single-stranded region of the target polynucleotide; however, only one pool bears labeled probe molecules that will hybridize.

[0169] After removal of the unligated probe, a nuclease recognizing the probe cuts the ligated complex at a site one or more nucleotides from the ligation site along the target polynucleotide leaving an end, usually a protruding strand, capable of participating in the next cycle of ligation and cleavage. An important feature of the nuclease is that its recognition site be separate from its cleavage site. In the course of such cycles of ligation and cleavage, the terminal nucleotides of the target polynucleotide are identified. As stated above, one such category of enzyme is that of type IIs restriction enzymes, which cleave sites up to 20 base pairs remote from their recognition sites; it is contemplated that such enzymes may exist which cleave at distances of up to 30 base pairs from their recognition sites.

[0170] Ideally, it is the terminal base whose identity is being determined (in which case it is the base closest to the double-stranded region of the probe which is labeled), and only this base is cleaved away by the type IIs enzyme. The cleaved probe molecules are recovered (e.g. by hybridization to a complementary sequence immobilized on a bead or other support matrix) and their fluorescent emission spectrum measured using a fluorimeter or other light-gathering device. Note that fluorimetric analysis may be made prior to cleavage of the probe from the test molecule; however, cleavage prior to qualitative analysis of fluorescence allows the next round of sequencing to commence while determination of the identity of the first sequenced base is in progress. Detection prior to cleavage is preferred where sequencing is carried out in parallel on a plurality of sequences (either segments of a single target polynucleotide or a plurality of altogether different target polynucleotides), e.g. attached to separate magnetic beads, or other types of solid phase supports, such as the replicable arrays of the invention. Note that whenever natural protein endonucleases are employed as the nuclease, the method further includes a step of methylating the target polynucleotide at the start of a sequencing operation to prevent spurious cleavages at internal recognition sites fortuitously located in the target polynucleotide.
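The cycle of ligation, detection and cleavage can be reduced to a toy model, sketched below, in which one terminal base is identified and removed per cycle. The function name is hypothetical, and the model deliberately ignores strand geometry, the randomized positions of the probe and imperfect ligation; it is meant only to show the bookkeeping of the ideal case in which a single base is cleaved away each cycle.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sequence_by_ligation_cleavage(overhang_bases, cycles):
    """Toy model of stepwise ligation and cleavage.

    overhang_bases: bases of the target strand in the order in which they
    become the terminal (interrogated) base.
    Returns the bases called, one per completed cycle.
    """
    calls = []
    remaining = overhang_bases
    for _ in range(cycles):
        if not remaining:
            break
        terminal = remaining[0]
        # Only the pool whose labeled base pairs with the terminal base
        # ligates, so the dye observed identifies that pool.
        ligated_pool_base = COMPLEMENT[terminal]
        # The target base is inferred as the complement of the labeled base.
        calls.append(COMPLEMENT[ligated_pool_base])
        # Cleavage by the type IIs nuclease removes the identified base and
        # exposes the next one for the following cycle.
        remaining = remaining[1:]
    return "".join(calls)


if __name__ == "__main__":
    print(sequence_by_ligation_cleavage("GATTACA", 4))
```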

[0171] By this method, there is no requirement for the electrophoretic separation of closely-sized DNA fragments, for difficult-to-automate gel-based separations, or the generation of nested deletions of the target polynucleotide. In addition, detection and analysis are greatly simplified because signal-to-noise ratios are much more favorable on a nucleotide-by-nucleotide basis, permitting smaller sample sizes to be employed. For fluorescent-based detection schemes, analysis is further simplified because fluorophores labeling different nucleotides may be separately detected in homogeneous solutions rather than in spatially overlapping bands.

[0172] As alluded to, the target polynucleotide may be anchored to a solid-phase support, such as a magnetic particle, polymeric microsphere, filter material, or the like, which permits the sequential application of reagents without complicated and time-consuming purification steps. The length of the target polynucleotide can vary widely; however, for convenience of preparation, lengths employed in conventional sequencing are preferred. For example, lengths in the range of a few hundred basepairs, 200-300, to 1 to 2 kilobase pairs are most often used.

[0173] Probes of use in the procedure may be labeled in a variety of ways, including the direct or indirect attachment of radioactive moieties, fluorescent moieties, colorimetric moieties, and the like. Many comprehensive reviews of methodologies for labeling DNA and constructing DNA probes provide guidance applicable to constructing probes (see Matthews et al., 1988, Anal. Biochem., 169: 1-25; Haugland, 1992, Handbook of Fluorescent Probes and Research Chemicals, Molecular Probes, Inc., Eugene, Oreg.; Keller and Manak, 1993, DNA Probes, 2nd Ed., Stockton Press, New York; Eckstein, ed., 1991, Oligonucleotides and Analogues: A Practical Approach, IRL Press, Oxford; Wetmur, 1991, Critical Reviews in Biochemistry and Molecular Biology, 26: 227-259). Many more particular labelling methodologies are known in the art (see Connolly, 1987, Nucleic Acids Res., 15: 3131-3139; Gibson et al., 1987, Nucleic Acids Res., 15: 6455-6467; Sproat et al., 1987, Nucleic Acids Res., 15: 4837-4848; Fung et al., U.S. Pat. No. 4,757,141; Hobbs et al., U.S. Pat. No. 5,151,507; Cruickshank, U.S. Pat. No. 5,091,519 [synthesis of functionalized oligonucleotides for attachment of reporter groups]; Jablonski et al., 1986, Nucleic Acids Res., 14: 6115-6128 [enzyme/oligonucleotide conjugates]; and Urdea et al., U.S. Pat. No. 5,124,246 [branched DNA]). The choice of attachment sites of labeling moieties does not significantly affect the ability of a given labeled probe to identify nucleotides in the target polynucleotide, provided that such labels do not interfere with the ligation and cleavage steps. In particular, dyes may be conveniently attached to the end of the probe distal to the target polynucleotide on either the 3′ or 5′ termini of strands making up the probe, e.g. Eckstein (cited above), Fung (cited above), and the like. In some cases, attaching labeling moieties to interior bases or inter-nucleoside linkages may be desirable.

[0174] As stated above, four sets of mixed probes are provided for addition to the target polynucleotide, where each is labeled with a distinguishable label. Typically, the probes are labeled with one or more fluorescent dyes, e.g. as disclosed by Menchen et al., U.S. Pat. No. 5,188,934; Bergot et al., PCT application PCT/US90/05565. Each of four spectrally resolvable fluorescent labels may be attached, for example, by way of Aminolinker II (all available from Applied Biosystems, Inc., Foster City, Calif.); these include TAMRA (tetramethylrhodamine), FAM (fluorescein), ROX (rhodamine X), and JOE (2′,7′-dimethoxy-4′,5′-dichlorofluorescein) and their attachment to oligonucleotides is described in Fung et al., U.S. Pat. No. 4,855,225.

[0175] Typically, nucleases employed in the invention are natural protein endonucleases (i) whose recognition site is separate from its cleavage site and (ii) whose cleavage results in a protruding strand on the target polynucleotide. Class IIs restriction endonucleases that may be employed are as previously described (Szybalski et al., 1991, Gene, 100: 13-26; Roberts et al., 1993, Nucleic Acids Res., 21: 3125-3137; Livak and Brenner, U.S. Pat. No. 5,093,245). Exemplary class IIs nucleases include AlwXI, BsmAI, BbvI, BsmFI, StsI, HgaI, BscAI, BbvII, BcefI, Bce85I, BccI, BcgI, BsaI, BsgI, BspMI, Bst71I, EarI, Eco57I, Esp3I, FauI, FokI, GsuI, HphI, MboII, MmeI, RleAI, SapI, SfaNI, TaqII, Tth111II, Bco5I, BpuAI, FinI, BsrDI, and isoschizomers thereof. Preferred nucleases include FokI, HgaI, EarI, and SfaNI. Reactions are generally carried out in 50 μl volumes of manufacturer's (New England Biolabs) recommended buffers for the enzymes employed, unless otherwise indicated. Standard buffers are also described in Sambrook et al., 1989, supra.

[0176] When conventional ligases are employed, the 5′ end of the probe may be phosphorylated. A 5′ monophosphate can be attached to a second oligonucleotide either chemically or enzymatically with a kinase (see Sambrook et al., 1989, supra). Chemical phosphorylation is described by Horn and Urdea, 1986, Tetrahedron Lett., 27: 4705, and reagents for carrying out the disclosed protocols are commercially available (e.g. 5′ Phosphate-ON™ from Clontech Laboratories; Palo Alto, Calif.).

[0177] Chemical ligation methods are well known in the art, e.g. Ferris et al., 1989, Nucleosides & Nucleotides, 8: 407-414; Shabarova et al., 1991, Nucleic Acids Res., 19: 4247-4251. Typically, ligation is carried out enzymatically using a ligase in a standard protocol. Many ligases are known and are suitable for use in the invention (Lehman, 1974, Science, 186: 790-797; Engler et al., 1982, “DNA Ligases,” in Boyer, ed., The Enzymes Vol. 15B pp. 3-30, Academic Press, New York). Preferred ligases include T4 DNA ligase, T7 DNA ligase, E. coli DNA ligase, Taq ligase, Pfu ligase and Tth ligase. Protocols for their use are well known (e.g. Sambrook et al., 1989, supra; Barany, 1991, PCR Methods and Applications, 1: 5-16; Marsh et al., 1992, Strategies, 5: 73-76). Generally, ligases require that a 5′ phosphate group be present for ligation to the 3′ hydroxyl of an abutting strand. This is conveniently provided for at least one strand of the target polynucleotide by selecting a nuclease which leaves a 5′ phosphate, e.g. FokI.

[0178] Prior to nuclease cleavage steps, usually at the start of a sequencing operation, the target polynucleotide is treated to block the recognition sites and/or cleavage sites of the nuclease being employed. This prevents undesired cleavage of the target polynucleotide because of the fortuitous occurrence of nuclease recognition sites at interior locations in the target polynucleotide. Blocking can be achieved in a variety of ways, including methylation and treatment by sequence-specific aptamers, DNA binding proteins, or oligonucleotides that form triplexes. Whenever natural protein endonucleases are employed, recognition sites can be conveniently blocked by methylating the target polynucleotide with the so-called “cognate” methylase of the nuclease being used; for most (if not all) type II bacterial restriction endonucleases, there exist cognate methylases that methylate their corresponding recognition sites. Many such methylases are known in the art (Roberts et al., 1993, supra; Nelson et al., 1993, Nucleic Acids Res., 21: 3139-3154) and are commercially available from a variety of sources, particularly New England Biolabs (Beverly, Mass.).

[0179] The method includes an optional capping step after the unligated probe is washed from the target polynucleotide. In a capping step, by analogy with polynucleotide synthesis (e.g. Andrus et al., U.S. Pat. No. 4,816,571), target polynucleotides that have not undergone ligation to a probe are rendered inert to further ligation steps in subsequent cycles. In this manner spurious signals from “out of phase” cleavages are prevented. When a nuclease leaves a 5′ protruding strand on the target polynucleotides, capping is usually accomplished by exposing the unreacted target polynucleotides to a mixture of the four dideoxynucleoside triphosphates, or other chain-terminating nucleoside triphosphates, and a DNA polymerase. The DNA polymerase extends the 3′ strand of the unreacted target polynucleotide by one chain-terminating nucleotide, e.g. a dideoxynucleotide, thereby rendering it incapable of ligating with probe in subsequent cycles.

[0180] Alternatively, a simple method involving quantitative incremental fluorescent nucleotide addition sequencing (QIFNAS) is employed, in which each end of each clonal oligonucleotide is sequenced by primer extension with a nucleic acid polymerase (e.g. Klenow or Sequenase™; U.S. Biochemicals) and one nucleotide at a time which has a traceable level of the corresponding fluorescent dNTP or rNTP, for example, 100 micromolar dCTP and 1 micromolar fluorescein-dCTP. This is done sequentially, e.g. dATP, dCTP, dGTP, dTTP, dATP and so forth until the incremental change in fluorescence is below a percentage that is adequate for useful discrimination from the cumulative total from previous cycles. The length of the sequence so determined may be extended by any of periodic photobleaching or cleavage of the accumulated fluorescent label from nascent nucleic acid molecules or denaturing the nascent nucleic acid strands from the array and re-priming the synthesis using sequence already obtained.
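The cyclic addition of single nucleotides and the reading of incremental fluorescence can be simulated in idealized form, as sketched below. The function names are hypothetical; the model assumes complete extension at every step and a signal strictly proportional to the number of bases incorporated, neither of which is guaranteed in practice.

```python
def qifnas(nascent_strand, order="ACGT", max_cycles=100):
    """Simulate cyclic single-nucleotide addition.

    nascent_strand: sequence (5'->3') that the polymerase will synthesize,
    i.e. the complement of the immobilized template.
    Returns a list of (nucleotide added, bases incorporated, cumulative signal).
    """
    position = 0
    cumulative = 0
    reads = []
    for cycle in range(max_cycles):
        if position >= len(nascent_strand):
            break
        base = order[cycle % len(order)]
        incorporated = 0
        # The polymerase extends through a run of identical bases in one step.
        while position < len(nascent_strand) and nascent_strand[position] == base:
            incorporated += 1
            position += 1
        cumulative += incorporated
        reads.append((base, incorporated, cumulative))
    return reads


def decode(reads):
    """Rebuild the sequence from the incremental signal of each addition."""
    return "".join(base * incorporated for base, incorporated, _ in reads)


if __name__ == "__main__":
    for base, incorporated, cumulative in qifnas("AACGT"):
        print(base, incorporated, cumulative)
```

Because each new increment is measured against a growing cumulative total, the relative size of a one-base increment shrinks as the read proceeds, which is why photobleaching, label cleavage or re-priming is used to extend the read length.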

[0181] After features are identified on a first array of the set, it is desirable to provide landmarks by which subsequently-produced arrays of the set are aligned with it, thereby enabling workers to locate on them features of interest. This is important, as the first array of a set produced by the method of the invention is, by nature, random, in that the nucleic acid molecules of the starting pool are not placed down in a specific or pre-ordered pattern based upon knowledge of their sequences.

[0182] Several types of markings are made according to the technology available in the art. For instance, selected features are removed by laser ablation (Matsuda and Chung, 1994, ASAIO Journal, 40(3): M594-7; Jay, 1988, Proc. Natl. Acad. Sci. U.S.A., 85: 5454-5458; Kimble, 1981, Dev. Biol., 87(2): 286-300) or selectively replicated on copies of an array by laser-enhanced adhesion (Emmert-Buck et al., 1996, Science, 274(5289): 998-1001). These methods are used to eliminate nucleic acid features that interfere with adjacent features or to create a pattern that is easier for software to align.

[0183] Laser ablation is carried out as follows: A KrF excimer laser, e.g. a Hamamatsu L4500 (Hamamatsu, Japan) (pulse wavelength, 248 nm; pulse width, 20 ns) is used as the light source. The laser beam is converged through a laser-grade UV quartz condenser lens to yield maximum fluences of 3.08 J/cm² per pulse. Ablation of the matrix and underlying glass surface is achieved by this method. The depth of etching into the glass surfaces is determined using real-time scanning laser microscopy (Lasertec 1LM21W, Yokohama, Japan), and a depth profile is determined.

[0184] Selective transfer of features via laser-capture microdissection proceeds as follows: A flat film (100 μm thick) is made by spreading a molten thermoplastic material, e.g. ethylene vinyl acetate polymer (EVA; Adhesive Technologies; Hampton, N.H.), on a smooth silicone or polytetrafluoroethylene surface. The optically-transparent thin film is placed on top of an array of the invention, and the array/film sandwich is viewed in an inverted microscope (e.g. an Olympus Model CK2; Tokyo) at 100× magnification (10× objective). A pulsed carbon dioxide laser beam is introduced by way of a small front-surface mirror coaxial with the condenser optical path, so as to irradiate the upper surface of the EVA film. The carbon dioxide laser (either Apollo Company model 580, Los Angeles, or California Laser Company model LS150, San Marcos, Calif.) provides individual energy pulses of adjustable length and power. A ZnSe lens focuses the laser beam to a target of adjustable spot size on the array. For transfer spots of 150 μm diameter, a 600-microsecond pulse delivers 25-30 mW to the film. The power is decreased or increased approximately in proportion to the diameter of the laser spot focused on the array. The absorption coefficient of the EVA film, measured by Fourier transmission, is 200 cm⁻¹ at a laser wavelength of 10.6 μm. Because >90% of the laser radiation is absorbed within the thermoplastic film, little direct heating occurs. The glass plate or chip upon which the semi-solid support has been deposited provides a heat sink that confines the full-thickness transient focal melting of the thermoplastic material to the targeted region of the array. The focally-molten plastic moistens the targeted features. After cooling and recrystallization, the film forms a local surface bond to the targeted nucleic acid molecules that is stronger than the adhesion forces that mediate their affinity for the semi-solid support medium. The film and targeted nucleic acids are removed from the array, resulting in focal microtransfer of the targeted nucleic acids to the film surface.

[0185] If removal of molecules from the array by this method is performed for the purpose of ablation, the procedure is complete. If desired, these molecules instead are amplified and cloned out, as described in Example 7.

[0186] A method provided by the invention for the easy orientation of the nucleic acid molecules of a set of arrays relative to one another is “array templating.” A homogeneous solution of an initial library of single-stranded DNA molecules is spread over a photolithographic all-10-mer ss-DNA oligomer array under conditions which allow sequences comprised by library members to become hybridized to member molecules of the array, forming an arrayed library where the coordinates are in order of sequence as defined by the array. For example, a 3′-immobilized 10-mer (upper strand) binds a 25-mer library member (lower strand) as shown below:

               5′-TGCATGCTAT-3′ [SEQ ID NO: 2]
3′-CGATGCATTTACGTAACGTACGATA-5′ [SEQ ID NO: 3]
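The pairing of the immobilized 10-mer with the 25-mer library member can be checked computationally, as in the following sketch. The sequences are SEQ ID NO: 2 and SEQ ID NO: 3 as recited above; the function names are hypothetical, and only perfect Watson-Crick matches are considered.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def binding_site(probe, target):
    """Index in target (5'->3') where probe pairs perfectly, or -1 if none."""
    return target.find(reverse_complement(probe))


probe = "TGCATGCTAT"  # SEQ ID NO: 2, written 5'->3'
library_member_3to5 = "CGATGCATTTACGTAACGTACGATA"  # SEQ ID NO: 3 as printed, 3'->5'
library_member = library_member_3to5[::-1]  # same strand rewritten 5'->3'

site = binding_site(probe, library_member)

if __name__ == "__main__":
    print(library_member)
    print(site)
```

A returned index of 0 indicates that the 10-mer pairs with the ten bases at the 5′ end of the library member, in agreement with the right-aligned duplex drawn above.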

[0187] Covalent linkage of the 25-mer sequence to the support, amplification and replica printing are performed by any of the methods described above. Further characterization, if required, is carried out by SBH, fluorescent dNTP extension or any other sequencing method applicable to nucleic acid arrays, such as are known in the art. This greatly enhances the ability to identify the sequence of a sufficient number of oligomer features in the replicated array to make the array useful in subsequent applications.

EXAMPLE 2

[0188] Ordered Chromosomal Arrays According to the Invention

[0189] Direct in situ single-copy (DISC)-PCR is a method that uses two primers that define unique sequences for on-slide PCR directly on metaphase chromosomes (Troyer et al., 1994a, Mammalian Genome, 5: 112-114; summarized by Troyer et al., 1997, Methods Mol. Biol., Vol. 71: PRINS and In Situ PCR Protocols, J. R. Gosden, ed., Humana Press, Inc., Totowa, N.J., pp. 71-76). It thus allows exponential accumulation of PCR product at specific sites, and so may be adapted for use according to the invention.

[0190] The DISC-PCR procedure has been used to localize sequences as short as 100-300 bp to mammalian chromosomes (Troyer et al., 1994a, supra; Troyer et al., 1994b, Cytogenet. Cell Genetics, 67(3): 199-204; Troyer et al., 1995, Anim. Biotechnology, 6(1): 51-58; and Xie et al., 1995, Mammalian Genome, 6: 139-141). It is particularly suited for physically assigning sequence tagged sites (STSs), such as microsatellites (Litt and Luty, 1989, Am. J. Hum. Genet., 44: 397-401; Weber and May, 1989, Am. J. Hum. Genet., 44: 388-396), many of which cannot be assigned by in situ hybridization because they have been isolated from small-insert libraries for rapid sequencing. It can also be utilized to map expressed sequence tags (ESTs) physically (Troyer et al., 1994a, supra; Schmutz et al., 1996, Cytogenet. Cell Genetics, 72: 37-39). DISC-PCR obviates the necessity for an investigator to have a cloned gene in hand, since all that is necessary is to have enough sequence information to synthesize PCR primers. By the methods of the invention, target-specific primers need not even be utilized; all that is required is a mixed pool of primers whose members have at one end a ‘universal’ sequence, suitable for manipulations such as restriction endonuclease cleavage or hybridization to oligonucleotide molecules immobilized on, or added to, a semi-solid support and, at the other end, an assortment of random sequences (for example, every possible hexamer) which will prime in situ amplification of the chromosome. As described above, the primers may include terminal crosslinking groups with which they may be attached to the semi-solid support of the array following transfer; alternatively, they may lack such an element, and be immobilized to the support either through ultraviolet crosslinking or through hybridization to complementary, immobilized primers and subsequent primer extension, such that the newly-synthesized strand becomes permanently bound to the array. The DISC-PCR procedure is summarized briefly as follows:

[0191] Metaphase chromosomes anchored to glass slides are prepared by standard techniques (Halnan, 1989, in Cytogenetics of Animals, C. R. E. Halnan, ed., CAB International, Wallingford, U.K., pp. 451-456), using slides that have been pre-rinsed in ethanol and dried using lint-free gauze. Slides bearing chromosome spreads are washed in phosphate-buffered saline (PBS; 8.0 g NaCl, 1.3 g Na₂HPO₄ and 4 g NaH₂PO₄ dissolved in deionized water, adjusted to a volume of 1 liter and pH of 7.4) for 10 min and dehydrated through an ethanol series (70-, 80-, 95-, and 100%). Note that in some cases, overnight fixation of chromosomes in neutral-buffered formalin followed by digestion for 15 minutes with pepsinogen (2 mg/ml; Sigma) improves amplification efficiency.

[0192] For each slide, the following solution is prepared in a microfuge tube: 200 μM each dATP, dCTP, dGTP and dTTP; all deoxynucleotides are maintained as frozen, buffered 10 mM stock solutions or in dry form, and may be obtained either in dry form or in solution from numerous suppliers (e.g. Perkin Elmer, Norwalk, Conn.; Sigma, St. Louis, Mo.; Pharmacia, Uppsala, Sweden). The reaction mixture for each slide includes 1.5 μM each primer (from 20 μM stocks), 2.0 μl 10× Taq polymerase buffer (100 mM Tris-HCl, pH 8.3, 500 mM KCl, 15 mM MgCl₂, 0.1% BSA; Perkin Elmer), 2.5 units AmpliTaq polymerase (Perkin Elmer) and deionized H₂O to a final volume of 20 μl. Note that the commercially supplied Taq polymerase buffer is normally adequate; however, adjustments may be made as needed in [MgCl₂] or pH, in which case an optimization kit, such as the Opti-Primer PCR Kit (Stratagene; La Jolla, Calif.) may be used. The above reaction mixture is pipetted onto the metaphase chromosomes and covered with a 22×50 mm coverslip, the perimeter of which is then sealed with clear nail polish. All air bubbles, even the smallest, are removed prior to sealing, as they expand when heated and will inhibit the reaction. A particularly preferred polish is Hard As Nails (Sally Hansen); this nail enamel has been found to be resistant to leakage, which, if it occurred, would also compromise the integrity of the reaction conditions and inhibit amplification of the chromosomal DNA sequences. One heavy coat is sufficient. After the polish has been allowed to dry at room temperature, the edges of the slide are covered with silicone grease (Dow Corning Corporation, Midland, Mich.). Slides are processed in a suitable thermal cycler (i.e. one designed for on-slide PCR, such as the BioOven III; Biotherm Corp., Fairfax, Va.) using the following profile:

[0193] a. 94° C. for 3 min.

[0194] b. Annealing temperature of primers for 1 min.

[0195] c. 72° C. for 1 min.

[0196] d. 92° C. for 1 min.

[0197] e. Cycle to step b 24 more times (25 cycles total).

[0198] f. Final extension step of 3-5 min.
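As a rough planning aid, the hold times in steps a through f can be totalled. The sketch below assumes that step a is run once, that steps b, c and d form the repeated cycle, and that ramp times between temperatures are ignored; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Step:
    label: str
    minutes: float


def total_hold_minutes(initial: Step, cycle: list, n_cycles: int,
                       final_minutes: float) -> float:
    """Sum the programmed hold times; temperature ramping is not included."""
    per_cycle = sum(step.minutes for step in cycle)
    return initial.minutes + n_cycles * per_cycle + final_minutes


initial = Step("a: 94 C initial denaturation", 3)
cycle = [
    Step("b: primer annealing", 1),
    Step("c: 72 C extension", 1),
    Step("d: 92 C denaturation", 1),
]

# Step f is given as a 3-5 min range, so compute both ends.
shortest = total_hold_minutes(initial, cycle, n_cycles=25, final_minutes=3)
longest = total_hold_minutes(initial, cycle, n_cycles=25, final_minutes=5)
print(shortest, longest)
```

On these assumptions the programmed holds come to between 81 and 83 minutes per run; actual instrument time will be longer because of ramping.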

[0199] After thermal cycling is complete, the silicone grease is removed with a tissue, and the slide is immersed in 100% ethanol. Using a sharp razor blade, the nail polish is cut through and the edge of the coverslip is lifted gently and removed. It is critical that the slide never be allowed to dry from this point on, although excess buffer is blotted gently off the slide edge. The slide is immersed quickly in 4× SSC and excess nail polish is scraped from the edges of the slide prior to subsequent use.

[0200] The slide is contacted immediately with a semi-solid support in order to transfer to it the amplified nucleic acid molecules; alternatively, the slide is first equilibrated in a liquid medium that is isotonic with or, ideally, identical to that which permeates (i.e. is present in the pores of) the semi-solid support matrix. From that point on, the array is handled comparably with those prepared according to the methods presented in Example 1. Feature identification, also as described above, permits determination of the approximate positions of genetic elements along the length of the template chromosome. In preparations in which chromosomes are linearly extended (stretched), the accuracy of gene ordering is enhanced. This is particularly useful in instances in which such information is not known, either through classical or molecular genetic studies, even in the extreme case of a chromosome that is entirely uncharacterized. By this method, comparative studies of homologous chromosomes between species of interest are performed, even if no previous genetic mapping has been performed on either. The information so gained is valuable in terms of gauging the evolutionary relationships between species, in that both large and small chromosomal rearrangements are revealed. The genetic basis of phenotypic differences between different individuals of a single species, e.g. human subjects, is also investigated by this method. When template chromosomes are condensed (coiled), more information is gained regarding the in vivo spatial relationships among genetic elements. This may have implications in terms of cell-type specific gene transcriptional activity, upon which comparison of arrays generated from samples comprising condensed chromosomes drawn from cells of different tissues of the same organism may shed light.

[0201] While the methods by which histological samples are prepared, PCR is performed and the first copy of the chromosomal array is generated are time-consuming, multiple copies of the array are produced easily according to the invention, as described above in Example 1 and elsewhere. The ability of the invention to reproduce what would otherwise be a unique array provides a valuable tool by which scientists have the power to work in parallel or to perform analyses of different types upon comparable samples. In addition, it allows for the generation of still more copies of the array for distribution to any number of other workers who may desire to confirm or extend any data set derived from such an array at any time.

[0202] A variation on this use of the present invention is chromosome templating. DNA (e.g. that of a whole chromosome) is stretched out and fixed on a surface (Zimmermann and Cox, 1994, Nucleic Acids Res., 22(3): 492-497). Segments of such immobilized DNA are made single-stranded by exonucleases, chemical denaturants (e.g. formamide) and/or heat. The single-stranded regions are hybridized to the variable portions of an array of single-stranded DNA molecules each bearing regions of randomized sequence, thereby forming an array where the coordinates of features correspond to their order on a linear extended chromosome. Alternatively, a less extended structure, which replicates the folded or partially-unfolded state of various nucleic acid compartments in a cell, is made by using a condensed (coiled), rather than stretched, chromosome.

EXAMPLE 3

[0203] RNA Localization Arrays

[0204] The methods described in Example 2, above, are applied with equal success to the generation of an array that provides a two-dimensional representation of the spatial distribution of the RNA molecules of a cell. This method is applied to ‘squashed’ cellular material, prepared as per the chromosomal spreads described above in Example 2; alternatively, sectioned tissue samples affixed to glass surfaces are used. Paraffin, plastic or frozen (Serrano et al., 1989, Dev. Biol. 132: 410-418) sections are used in the latter case.

[0205] Tissue samples are fixed using conventional reagents; formalin, 4% paraformaldehyde in an isotonic buffer, formaldehyde (each of which confers a measure of RNAase resistance to the nucleic acid molecules of the sample) or a multi-component fixative, such as FAAG (85% ethanol, 4% formaldehyde, 5% acetic acid, 1% EM grade glutaraldehyde) is adequate for this procedure. Note that water used in the preparation of any aqueous components of solutions to which the tissue is exposed until it is embedded is RNAase-free, i.e. treated with 0.1% diethylpyrocarbonate (DEPC) at room temperature overnight and subsequently autoclaved for 1.5 to 2 hours. Tissue is fixed at 4° C., either on a sample roller or a rocking platform, for 12 to 48 hours in order to allow fixative to reach the center of the sample. Prior to embedding, samples are purged of fixative and dehydrated; this is accomplished through a series of two- to ten-minute washes in increasingly high concentrations of ethanol, beginning at 60% and ending with two washes in 95% and another two in 100% ethanol, followed by two ten-minute washes in xylene. Samples are embedded in any of a variety of sectioning supports, e.g. paraffin, plastic polymers or a mixed paraffin/polymer medium (e.g. Paraplast® Plus Tissue Embedding Medium, supplied by Oxford Labware). For example, fixed, dehydrated tissue is transferred from the second xylene wash to paraffin or a paraffin/polymer resin in the liquid phase at about 58° C., which is then replaced three to six times over a period of approximately three hours to dilute out residual xylene, followed by overnight incubation at 58° C. under a vacuum, in order to optimize infiltration of the embedding medium into the tissue. The next day, following several more changes of medium at 20 minute to one hour intervals, also at 58° C., the tissue sample is positioned in a sectioning mold, the mold is surrounded by ice water and the medium is allowed to harden. Sections of 6 μm thickness are taken and affixed to ‘subbed’ slides, which are those coated with a proteinaceous substrate material, usually bovine serum albumin (BSA), to promote adhesion. Other methods of fixation and embedding are also applicable for use according to the methods of the invention; examples of these are found in Humason, G. L., 1979, Animal Tissue Techniques, 4th ed. (W. H. Freeman & Co., San Francisco), as is frozen sectioning.

[0206] Following preparation of either squashed or sectioned tissue, the RNA molecules of the sample are reverse-transcribed in situ. In order to contain the reaction on the slide, tissue sections are placed on a slide thermal cycler (e.g. Tempcycler II; COY Corp., Grass Lake, Mich.) with heating blocks designed to accommodate glass microscope slides. Stainless steel or glass (Bellco Glass Inc.; Vineland, N.J.) tissue culture cloning rings approximately 0.8 cm (inner diameter) × 1.0 cm in height are placed on top of the tissue section. Clear nail polish is used to seal the bottom of the ring to the tissue section, forming a vessel for the reverse transcription and subsequent localized in situ amplification (LISA) reaction (Tsongalis et al., 1994, supra).

[0207] Reverse transcription is carried out using reverse transcriptase (e.g. avian myeloblastosis virus reverse transcriptase, AMV-RT; Life Technologies/Gibco-BRL or Moloney Murine Leukemia Virus reverse transcriptase, M-MLV-RT, New England Biolabs, Beverly, Mass.) under the manufacturer's recommended reaction conditions. For example, the tissue sample is rehydrated in the reverse transcription reaction mix, minus enzyme, which contains 50 mM Tris·HCl (pH 8.3), 8 mM MgCl₂, 10 mM dithiothreitol, 1.0 mM each dATP, dTTP, dCTP and dGTP and 0.4 mM oligo-dT (12- to 18-mers). The tissue sample is, optionally, rehydrated in RNAase-free TE (10 mM Tris·HCl, pH 8.3 and 1 mM EDTA), then drained thoroughly prior to addition of the reaction buffer. To denature the RNA molecules, which may have formed some double-stranded secondary structures, and to facilitate primer annealing, the slide is heated to 65° C. for 1 minute, after which it is cooled rapidly to 37° C. After 2 minutes, 500 units of M-MLV-RT are added to the mixture, bringing the total reaction volume to 100 μl. The reaction is incubated at 37° C. for one hour, with the reaction vessel covered by a microscope cover slip to prevent evaporation.

[0208] Following reverse transcription, reagents are pipetted out of the containment ring structure, which is rinsed thoroughly with TE buffer in preparation for amplification of the resulting cDNA molecules.

[0209] The amplification reaction is performed in a total volume of 25 μl, which contains 75 ng each of the forward and reverse primers (for example the mixed primer pools 1 and 2 of Example 6) and 0.6 U of Taq polymerase in a reaction solution containing, per liter: 200 nmol of each deoxynucleotide triphosphate, 1.5 mmol of MgCl₂, 67 mmol of Tris·HCl (pH 8.8), 10 mmol of 2-mercaptoethanol, 16.6 mmol of ammonium sulfate, 6.7 μmol of EDTA, and 10 μmol of digoxigenin-11-dUTP. The reaction mixture is added to the center of the cloning ring and layered over with mineral oil to prevent evaporation before slides are placed back onto the slide thermal cycler. DNA is denatured in situ at 94° C. for 2 min prior to amplification. LISA is accomplished by using 20 cycles, each consisting of a 1-minute primer annealing step (55° C.), a 1.5-min extension step (72° C.), and a 1-min denaturation step (94° C.). These amplification cycle profiles differ from those used in tube amplification in order to preserve optimal tissue morphology, and hence the distribution of reverse transcripts and the products of their amplification on the slide.

[0210] Following amplification, the oil layer and reaction mix are removed from the tissue sample, which is then rinsed with xylene. The containment ring is removed with acetone, and the tissue containing the amplified cDNA is rehydrated by washing three times in approximately 0.5 ml of a buffer containing 100 mM Tris-Cl (pH 7.5) and 150 mM NaCl. The immobilized nucleic acid array of the invention is then formed by contacting the amplified nucleic acid molecules with a semi-solid support and covalently crosslinking them to it, by any of the methods described above.

[0211] Features are identified using SBH, also as described above, and correlated with the positions of mRNA molecules in the cell.

EXAMPLE 4

[0212] Size-Sorted Genomic Arrays

[0213] As mentioned above, it is possible to prepare a support matrix in which are embedded whole, even living, cells. Such protocols have been developed for various purposes, such as encapsulated, implantable cell-based drug-delivery vehicles, and the delivery to an electrophoretic matrix of very large, unsheared DNA molecules, as required for pulsed-field gel electrophoresis (Schwartz and Cantor, 1984, Cell, 37: 67-75). The arrays of the invention are constructed using as the starting material genomic DNA from a cell of an organism that has been embedded in an electrophoretic matrix and lysed in situ, such that intact nucleic acid molecules are released into the support matrix environment. If an array based upon copies of large molecules is made, such as is of use in a fashion similar to the chromosomal element ordering arrays described above in Example 2, then a low-percentage agarose gel is used as a support. Following lysis (Schwartz and Cantor, 1984, supra), the resulting large molecules may be size-sorted electrophoretically prior to in situ PCR amplification and linkage to the support, both as described above. If it is desired to preserve the array on a support other than agarose, which may be difficult to handle if the gel is large, the array is transferred via electroblotting onto a second support, such as a nylon or nitrocellulose membrane, prior to linkage.

[0214] If it is not considered essential to preserve the associations between members of genetic linkage groups (at the coarsest level of resolution, chromosomes), nucleic acid molecules are cleaved, mechanically, chemically or enzymatically, prior to electrophoresis. A more even distribution of nucleic acid over the support results, and physical separation of individual elements from one another is improved. In such a case, a polyacrylamide, rather than agarose, gel matrix is used as a support. The arrays produced by this method do, to a certain extent, resemble sequencing gels; cleavage of an electrophoresed array, e.g. with a second restriction enzyme or combination thereof, followed by electrophoresis in a second dimension improves resolution of individual nucleic acid sequences from one another.

[0215] Such an array is constructed to any desired size. It is now feasible to scan large gels (for example, 40 cm in length) at high resolution. In addition, advances in gel technology now permit sequencing to be performed on gels a mere 4 cm long, one tenth the usual length, which demonstrates that a small gel is also useful according to the invention.

EXAMPLE 5

[0216] Spray-Painted Arrays (Inkjet)

[0217] Immobilized nucleic acid molecules may, if desired, be produced using a device (e.g., any commercially-available inkjet printer, which may be used in substantially unmodified form) which sprays a focused burst of nucleic acid synthesis compounds onto a support (see Castellino, 1997, Genome Res., 7: 943-976). Such a method is currently in practice at Incyte Pharmaceuticals and Rosetta Biosystems, Inc., the latter of which employs what are said to be minimally-modified Epson inkjet cartridges (Epson America, Inc.; Torrance, Calif.). The method of inkjet deposition depends upon the piezoelectric effect, whereby a narrow tube containing a liquid of interest (in this case, oligonucleotide synthesis reagents) is encircled by an adapter. An electric charge sent across the adapter causes the adapter to expand at a different rate than the tube, and forces a small drop of liquid containing phosphoramidite chemistry reagents from the tube onto a coated slide or other support.

[0218] Reagents are deposited onto a discrete region of the support, such that each region forms a feature of the array; the desired nucleic acid sequence is synthesized drop-by-drop at each position, as is true in other methods known in the art. If the angle of dispersion of reagents is narrow, it is possible to create an array comprising many features. Alternatively, if the spraying device is more broadly focused, such that it disperses nucleic acid synthesis reagents in a wider angle, as much as an entire support is covered each time, and an array is produced in which each member has the same sequence (i.e. the array has only a single feature).

[0219] Arrays of both types are of use in the invention. A multi-feature array produced by the inkjet method is used in array templating, as described above; a random library of nucleic acid molecules is spread upon such an array as a homogeneous solution comprising a mixed pool of nucleic acid molecules, by contacting the array with a tissue sample comprising nucleic acid molecules, or by contacting the array with another array, such as a chromosomal array (Example 2) or an RNA localization array (Example 3).

[0220] Alternatively, a single-feature array produced by the inkjet method is used by the same methods to immobilize nucleic acid molecules of a library which comprise a common sequence, whether a naturally-occurring sequence of interest (e.g. a regulatory motif) or an oligonucleotide primer sequence comprised by all or a subset of library members, as described herein above and in Example 6, below.

[0221] Nucleic acid molecules which thereby are immobilized upon an ordered inkjet array (whether such an array comprises one or a plurality of oligonucleotide features) are amplified in situ, transferred to a semi-solid support and immobilized thereon to form a first randomly-patterned, immobilized nucleic acid array, which is subsequently used as a template with which to produce a set of such arrays according to the invention, all as described above.

EXAMPLE 6

[0222] Isolation of a Feature from an Array of the Invention (Method 1)/Heterologous Arrays

[0223] As described above in Example 1, sets of arrays are, if desired, produced according to the invention such that they incorporate oligonucleotide sequences bearing restriction sites linked to the ends of each feature. This provides a method for creating spatially-unique arrays of primer pairs for in situ amplification, in which each feature has a distinct set of primer pairs. One or both of the universal primers comprises a restriction endonuclease recognition site, such as a type IIS site (e.g. that of Eco57I or MmeI, which cut up to 20 bp away). Treatment of the whole double-stranded array with the corresponding enzyme(s), followed by melting and washing away the non-immobilized strand, creates the desired primer pairs with well-defined 3′ ends. Alternatively, a double-strand-specific 3′ exonuclease treatment of the double-stranded array is employed, but the resulting single-stranded 3′ ends may vary in exact endpoint. The 3′ ends of the primers are used for in situ amplification, for example of variant sequences in diagnostics. This method, by which arrays of unique primer pairs are produced efficiently, provides an advance over the method of Adams and Kron (1997, supra), in which each single pair of primers is manually constructed and placed. Cloning of a given feature from an array of such a set is performed as follows:

[0224] MmeI is a restriction endonuclease having the property of cleaving at a site remote from its recognition site, TCCGAC. Heterogeneous pools of primers are constructed that comprise (from 5′ to 3′) a sequence shared by all members of the pool, the MmeI recognition site, and a variable region. The variable region may comprise either a fully-randomized sequence (e.g. all possible hexamers) or a selected pool of sequences (e.g. variations on a particular protein-binding, or other, functional sequence motif). If the variable sequence is random, the length of the randomized sequence determines the sequence complexity of the pool. For example, randomization of a hexameric sequence at the 3′ ends of the primers results in a pool comprising 4,096 distinct sequence combinations. Examples of two such mixed populations of oligonucleotides (in this case, 32-mers) are primer pools 1 and 2, below:

primer 1 (a pool of 4096 32-mers): 5′ gcagcagtacgactagcataTCCGACnnnnnn 3′ [SEQ ID NO: 4]
primer 2 (a pool of 4096 32-mers): 5′ cgatagcagtagcatgcaggTCCGACnnnnnn 3′ [SEQ ID NO: 5]
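The pool complexity stated above (4^6 = 4,096 members, each a 32-mer) can be confirmed by enumerating the pools. The following is a minimal sketch; the function name primer_pool is hypothetical and the enumeration assumes a fully randomized hexamer with all four bases allowed at every position.

```python
from itertools import product


def primer_pool(constant: str, site: str, n_random: int) -> list:
    """Enumerate every primer of the form constant + site + random tail (5'->3')."""
    return [constant + site + "".join(tail).lower()
            for tail in product("ACGT", repeat=n_random)]


MMEI_SITE = "TCCGAC"  # MmeI recognition site as given in the text

pool1 = primer_pool("gcagcagtacgactagcata", MMEI_SITE, 6)  # primer 1 pool
pool2 = primer_pool("cgatagcagtagcatgcagg", MMEI_SITE, 6)  # primer 2 pool

print(len(pool1), len(pool1[0]))
```

Each pool contains 4,096 distinct 32-mers; lengthening the randomized tail by one base multiplies the complexity by four.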

[0225] A nucleic acid preparation is amplified, using primer 1 to randomly prime synthesis of sequences present therein. The starting nucleic acid molecules are cDNA or genomic DNA, either of which may comprise molecules that are substantially whole or that are broken into smaller pieces. Many DNA cleavage methods are well known in the art. Mechanical cleavage is achieved by several methods, including sonication, repeated passage through a hypodermic needle, boiling or repeated rounds of rapid freezing and thawing. Chemical cleavage is achieved by means which include, but are not limited to, acid or base hydrolysis, or cleavage by base-specific cleaving substances, such as are used in DNA sequencing (Maxam and Gilbert, 1977, Proc. Natl. Acad. Sci. U.S.A., 74: 560-564). Alternatively, enzymatic cleavage is performed that is either site-specific, such as is mediated by restriction endonucleases, or more general, such as is mediated by exo- and endonucleases, e.g. ExoIII, mung bean nuclease, DNAase I or, under specific buffer conditions, DNA polymerases (such as T4), which chew back or internally cleave DNA in a proofreading capacity. If the starting nucleic acid molecules (which may, additionally, comprise RNA) are fragmented rather than whole (whether closed circular or chromosomal), so as to have free ends to which a second sequence may be attached by means other than primed synthesis, the MmeI recognition sites may be linked to the starting molecules using DNA ligase, RNA ligase or terminal deoxynucleotide transferase. Reaction conditions for these enzymes are as recommended by the manufacturer (e.g. New England Biolabs; Beverly, Mass. or Boehringer Mannheim Biochemicals, Indianapolis, Ind.). If employed, PCR is performed using template DNA (at least 1 fg; more usefully, 1-1,000 ng) and at least 25 pmol of oligonucleotide primers; an upper limit on primer concentration is set by aggregation at about 10 μg/ml. A typical reaction mixture includes: 2 μl of DNA, 25 pmol of oligonucleotide primer, 2.5 μl of 10× PCR buffer 1 (Perkin-Elmer, Foster City, Calif.), 0.4 μl of 1.25 μM dNTP, 0.15 μl (or 2.5 units) of Taq DNA polymerase (Perkin Elmer, Foster City, Calif.) and deionized water to a total volume of 25 μl. Mineral oil is overlaid and the PCR is performed using a programmable thermal cycler. The length and temperature of each step of a PCR cycle, as well as the number of cycles, are adjusted in accordance with the stringency requirements in effect. Initial denaturation of the template molecules normally occurs at between 92° C. and 99° C. for 4 minutes, followed by 20-40 cycles consisting of denaturation (94-99° C. for 15 seconds to 1 minute), annealing (temperature determined as discussed below, 1-2 minutes), and extension (72° C. for 1 minute). Final extension is generally for 4 minutes at 72° C., and may be followed by an indefinite (0-24 hour) step at 4° C.

[0226] Annealing temperature and timing are determined both by the efficiency with which a primer is expected to anneal to a template and by the degree of mismatch that is to be tolerated. In attempting to amplify a mixed population of molecules, the potential loss of molecules having target sequences with low melting temperatures under stringent (high-temperature) annealing conditions is weighed against the promiscuous annealing of primers to sequences other than their target sequence. The ability to judge the limits of tolerance for feature loss vs. the inclusion of artifactual amplification products is within the knowledge of one of skill in the art. An annealing temperature of between 30° C. and 65° C. is used. One primer out of the pool of 4096 primer 1 sequences (primer 1ex) is shown below, as is a DNA sequence from the preparation with which primer 1ex has high 3′ end complementarity at a random position. The priming site is the 3′-terminal hexamer of the primer (ctgcgt) and its complement on the template (gacgca).

primer 1ex: 5′-gcagcagtacgactagcataTCCGACctgcgt-3′ [SEQ ID NO: 7; bases 1-32]
genomic DNA: 3′-tttcgacgcacatcgcgtgcatggccccatgcatcaggctgacgaccgtcgtacgtctactcggct-5′ [SEQ ID NO: 6]

[0227] After priming, polymerase extension of primer 1ex on the template results in:

5′-gcagcagtacgactagcataTCCGACctgcgtgtagcgcacgtaccggggtacgtagtccgactgctggcagcatgcagatgagccga-3′ [SEQ ID NO: 7]
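The extension product above can be reproduced from the primer and template sequences. The sketch below is an idealized model only: it assumes the primer anneals through a perfect match of its 3′-terminal hexamer at the first such site on the template and that the polymerase copies to the end of the template. The function names are hypothetical.

```python
COMPLEMENT = {"a": "t", "c": "g", "g": "c", "t": "a"}


def complement(seq: str) -> str:
    """Base-by-base complement, preserving the left-to-right order."""
    return "".join(COMPLEMENT[base] for base in seq.lower())


def extend_primer(primer_5to3: str, template_3to5: str, anchor_len: int = 6) -> str:
    """Anneal the primer's 3' end to the template and extend to the template's 5' end.

    The template is written 3'->5', so it lies antiparallel to the primer and
    the complement of the primer's 3' anchor appears in it in the same
    left-to-right order.
    """
    template = template_3to5.lower()
    anchor = primer_5to3[-anchor_len:].lower()
    site = template.find(complement(anchor))
    if site < 0:
        raise ValueError("primer 3' end has no perfect match on the template")
    downstream = template[site + anchor_len:]
    return primer_5to3 + complement(downstream)


primer_1ex = "gcagcagtacgactagcataTCCGACctgcgt"
genomic_3to5 = ("tttcgacgcacatcgcgtgcatggccccatgcatcagg"
                "ctgacgaccgtcgtacgtctactcggct")

product_strand = extend_primer(primer_1ex, genomic_3to5)
print(product_strand)
```

Under these assumptions the computed strand is the 88-base extension product shown in the text.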

[0228] Out of the pool of 4096 primer 2 sequences, one primer with high 3′ end complementarity to a random position in the extended primer 1ex DNA is selected by a polymerase for priming (the priming site is the ctgctg hexamer of the extended strand, which pairs with the 3′-terminal hexamer of primer 2ex, 3′-gacgac-5′):

5′-gcagcagtacgactagcataTCCGACctgcgtgtagcgcacgtaccggggtacgtagtccgactgctggcagcatgcagatgagccga-3′ [SEQ ID NO: 7]
primer 2ex: 3′-gacgacCAGCCTggacgtacgatgacgatagc-5′ [SEQ ID NO: 8; bases 1-32]

[0229] After priming and synthesis, the resulting second strand is:

3′-cgtcgtcatgctgatcgtatAGGCTGgacgcacatcgcgtgcatggccccatgcatcaggctgacgacCAGCCTggacgtacgatgacgatagc-5′ [SEQ ID NO: 8]

[0230] Primer 3, shown below, is a 26-mer that is identical to the constant region of primer 1ex:

5′-gcagcagtacgactagcataTCCGAC-3′ [SEQ ID NO: 7; nucleotides 1-26]

It is immobilized by a 5′ acrylyl group to a polyacrylamide layer on a glass slide.

[0231] Primer 4, below, is a 26-mer that is identical in sequence to the constant region of primer 2ex:

5′-cgatagcagtagcatgcaggTCCGAC-3′ [SEQ ID NO: 8; nucleotides 1-26]

It is optionally immobilized to the polyacrylamide layer by a 5′ acrylyl group.

[0232] The pool of amplified molecules derived from the sequential priming of the original nucleic acid preparation with mixed primers 1 and 2, including the product of 1ex/2ex priming and extension, is hybridized to immobilized primers 3 and 4. In situ PCR is performed as described above, resulting in the production of a first random, immobilized array of nucleic acid molecules according to the invention. This array is replicated by the methods described in Example 1 in order to create a plurality of such arrays according to the invention.

[0233] After in situ PCR using primers 3 and 4:

5′-gcagcagtacgactagcataTCCGACctgcgtgtagcgcacgtaccggggtacgtagtccgactgctgGTCGGAcctgcatgctactgctatcg-3′ [SEQ ID NO: 9]
3′-cgtcgtcatgctgatcgtatAGGCTGgacgcacatcgcgtgcatggccccatgcatcaggctgacgacCAGCCTggacgtacgatgacgatagc-5′ [SEQ ID NO: 8]

[0234] After cutting with MmeI and removal of the non-immobilized strands:

5′-gcagcagtacgactagcataTCCGACctgcgtgtagcgcacgtacc-3′ [SEQ ID NO: 9; bases 1-46] (primer 1-based, clone-specific oligonucleotide)
3′-ccatgcatcaggctgacgacCAGCCTggacgtacgatgacgatagc-5′ [SEQ ID NO: 8; bases 1-46] (primer 2-based, clone-specific oligonucleotide)

[0235] The resulting random arrays of oligonucleotide primers representing the nucleic acid sequences of the original preparation are useful in several ways. Any particular feature, such as the above pair of primers, is used selectively to amplify the intervening sequence (in this case two central bp of the original 42 bp cloned segment are captured for each use of the chip or a replica) from a second nucleic acid sample. This is performed in solution or in situ, as described above, following feature identification on the array, using free, synthetic primers. If desired, allele-specific primer extension or subsequent hybridization is performed.
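The arithmetic behind the two central base pairs can be traced with a simplified model of the MmeI step. The sketch assumes that each immobilized strand retains its constant region, the TCCGAC site and exactly 20 nucleotides beyond the site, as in the 46-mers shown in this Example, and that only the first (primer-proximal) site on each strand is considered. The function names are hypothetical, and the real enzyme's staggered cut is not modelled.

```python
def reverse_complement(seq: str) -> str:
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


SITE = "TCCGAC"  # MmeI recognition site
REACH = 20       # nucleotides retained beyond the site (assumption from the 46-mers)


def clone_specific_oligo(strand_5to3: str) -> str:
    """Keep everything up to REACH nucleotides past the first recognition site."""
    strand = strand_5to3.upper()
    start = strand.find(SITE)  # first site only; any internal sites are ignored
    if start < 0:
        raise ValueError("no MmeI site on this strand")
    return strand[:start + len(SITE) + REACH]


# Upper strand after in situ PCR: primer 3, 42-nt insert, complement of primer 4.
INSERT = "ctgcgtgtagcgcacgtaccggggtacgtagtccgactgctg"
top = ("gcagcagtacgactagcataTCCGAC" + INSERT + "GTCGGAcctgcatgctactgctatcg").upper()
bottom = reverse_complement(top)

top_oligo = clone_specific_oligo(top)        # primer 1-based oligonucleotide
bottom_oligo = clone_specific_oligo(bottom)  # primer 2-based oligonucleotide

# Part of the upper strand covered by neither clone-specific oligonucleotide.
uncovered = top[len(top_oligo):len(top) - len(bottom_oligo)]
print(len(INSERT), uncovered)
```

With a 42-base insert and 20 bases retained from each side, 42 - 40 = 2 central bases remain between the two primers, in agreement with the figure given above.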

[0236] Importantly, this technique provides a means of obtaining corresponding, or homologous, nucleic acid arrays from a second cell line, tissue, organism or species according to the invention. The ability to compare corresponding genetic sequences derived from different sources is useful in many experimental and clinical situations. By “corresponding genetic sequences,” one means the nucleic acid content of different tissues of a single organism or tissue-culture cell lines. Such sequences are compared in order to study the cell-type specificity of gene regulation or mRNA processing or to observe chromosomal rearrangements that might arise in one tissue rather than another. Alternatively, the term refers to nucleic acid samples drawn from different individuals, in which case a given gene or its regulation is compared between or among samples. Such a comparison is of use in linkage studies designed to determine the genetic basis of disease, in forensic techniques and in population genetic studies. Lastly, it refers to the characterization and comparison of a particular nucleic acid sequence in a first organism and its homologues in one or more other organisms that are separated evolutionarily from it by varying lengths of time in order to highlight important (therefore, conserved) sequences, estimate the rate of evolution and/or establish phylogenetic relationships among species. The invention provides a method of generating a plurality of immobilized nucleic acid arrays, wherein each array of the plurality contains copies of nucleic acid molecules from a different tissue, individual organism or species of organism.

[0237] Alternatively, a first array of oligonucleotide primers with sequences unique to members of a given nucleic acid preparation is prepared by means other than the primed synthesis described above. To do this, a nucleic acid sample is obtained from a first tissue, cell line, individual or species and cloned into a plasmid or other replicable vector which comprises, on either side of the cloning site, a type IIS enzyme recognition site sufficiently close to the junction between vector and insert that cleavage with the type IIS enzyme(s) recognizing either site occurs within the insert sequences, at least 6 to 10, preferably 10 to 20, base pairs away from the junction site. It is contemplated that type IIS restriction endonuclease activity may even occur at a distance of up to 30 base pairs from the junction site. The nucleic acid molecules are cleaved from the vector using restriction enzymes that cut outside of both the primer and oligonucleotide sequences, and are then immobilized on a semi-solid support according to the invention by any of the methods described above in which covalent linkage of molecules to the support occurs at their 5′ termini, but does not occur at internal bases. Cleavage with the type IIS enzyme (such as MmeI) to yield the immobilized, sequence-specific oligonucleotides is performed as described above in this Example.

[0238] As mentioned above, it is not necessary to immobilize primer 4 on the support. If primer 4 is left free, the in situ PCR products yield the upper (primer 1 derived) strand upon denaturation: 5′-gcagcagtacgactagcataTCCGACctgcgtgtagcgcacgtaccggggtacgtagtccgactgctgGTCGGAcctgcatgctactgctatcg-3′ [SEQ ID NO: 9].
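The relationship between the 42 bp cloned segment, the two flanking type IIS sites and the two central base pairs mentioned in this Example can be checked with a short sketch. The following Python fragment is illustrative only: the function name `insert_and_tags` and the parameter `reach` are hypothetical, and the 20-nucleotide reach is an assumed simplification of MmeI cleavage that ignores the short 3′ overhang the enzyme leaves.

```python
# Upper strand as printed in this Example; capitals mark the two
# type IIS (MmeI) recognition sites, lowercase is flanking/insert sequence.
TOP = (
    "gcagcagtacgactagcata" "TCCGAC"
    "ctgcgtgtagcgcacgtaccggggtacgtagtccgactgctg"
    "GTCGGA" "cctgcatgctactgctatcg"
)
SITE = "TCCGAC"      # site as read on the upper strand
SITE_RC = "GTCGGA"   # the same site on the lower strand
REACH = 20           # assumption: cleavage about 20 nt into the insert


def insert_and_tags(top, reach=REACH):
    """Return (insert, left_tag, right_tag, central) for the cloned segment.

    The search is case-sensitive on purpose, because the printed sequence
    uses capitals to mark the recognition sites.
    """
    start = top.index(SITE) + len(SITE)
    end = top.index(SITE_RC)
    insert = top[start:end]
    left_tag = insert[:reach]                       # captured from the left site
    right_tag = insert[-reach:]                     # captured from the right site
    central = insert[reach:len(insert) - reach]     # bases between the two tags
    return insert, left_tag, right_tag, central


insert, left_tag, right_tag, central = insert_and_tags(TOP)
print(len(insert), left_tag, right_tag, central)
```

Under these assumptions the two tags of 20 nucleotides each leave two central bases of the 42 bp segment between them, which is consistent with the "two central bp" noted above.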

[0239] This sequence is available for hybridization to fluorescently-labeled DNA or RNA for mRNA quantitation or genotyping.

EXAMPLE 7

[0240] Isolation of a Feature from an Array of the Invention (Method 2)

[0241] As described above, laser-capture microdissection is performed in order to help orient a worker using the arrays of a set of arrays produced according to the invention, or to remove undesirable features from them. Alternatively, this procedure is employed to facilitate the cloning of selected features of the array that are of interest. The transfer of the nucleic acid molecules of a given feature or group of features from the array to a thin film of EVA or another heat-sensitive adhesive substance is performed as described above. Following those steps, the molecules are amplified and cloned as follows:

[0242] The transfer film and adherent cells are immediately resuspended in 40 μl of 10 mM Tris·HCl (pH 8.0), 1 mM EDTA and 1% Tween-20, and incubated overnight at 37° C. in a test tube, e.g. a polypropylene microcentrifuge tube. The mixture is then boiled for 10 minutes. The tubes are briefly spun (1000 rpm, 1 min.) to remove the film, and 0.5 μl of the supernatant is used for PCR. Typically, the sheets of transfer film initially applied to the array are small circular disks (diameter 0.5 cm). For more efficient elution of the transferred material after LCM transfer, the disk is placed into a well in a 96-well microtiter plate containing 40 μl of extraction buffer. Oligonucleotide primers specific for the sequence of interest may be designed and prepared by any of the methods described above. PCR is then performed according to standard methods, as described in the above examples.

EXAMPLE 8

[0243] Excluded Volume Protecting Groups

[0244] The density of features of the arrays is limited in that they must be sufficiently separated to avoid contamination of adjacent features during repeated rounds of amplification and replication. This is achieved using dilute concentrations of nucleic acid pools, but results in a density limited by the Poisson distribution to a maximum of approximately 37% of the available, appropriately spaced sites being occupied by a single molecule. In order to increase the density of features while maintaining the spacing necessary to avoid cross contamination, the following approach may be taken.
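The 37% figure follows from the Poisson distribution: if molecules land on sites independently with a mean of m molecules per site, the fraction of sites holding exactly one molecule is m·exp(−m), which peaks at m = 1 with a value of 1/e. The short Python sketch below illustrates this; the function names are hypothetical and the grid search is only a convenience for locating the maximum.

```python
import math


def single_occupancy(mean_per_site):
    """Poisson probability that a site holds exactly one molecule."""
    return mean_per_site * math.exp(-mean_per_site)


def best_loading(step=0.01, upper=3.0):
    """Scan mean loadings on a grid and return the one that maximizes
    the fraction of singly occupied sites."""
    candidates = [i * step for i in range(1, int(upper / step) + 1)]
    return max(candidates, key=single_occupancy)


m = best_loading()
print(f"best mean loading: {m:.2f} molecules/site")
print(f"singly occupied fraction: {single_occupancy(m):.3f}")
```

Loading more heavily than one molecule per site on average only converts singly occupied sites into multiply occupied ones, which is why dilution alone cannot exceed this ceiling.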

[0245] An activity which can bind the nucleic acid molecules of the pool is positioned in spots on the surface of the array support to create a capture array. The spots of the capture array are arranged such that they are separated by a distance greater than the size of the spots (this is typically near the resolution of the intended detection and imaging devices, or approximately 3 microns). The size of the spots is set to be less than the diameter of the excluded volume of the nucleic acid polymer to be captured (for example, approximately one micron for 50 kb lambda DNA in 10 mM NaCl; see Rybenkov et al., 1993, Proc. Natl. Acad. Sci. U.S.A. 90: 5307-5311, Zimmerman & Trach, 1991, J. Mol. Biol. 222: 599-620, and Sobel & Harpst, 1991, Biopolymers 31: 1559-1564, incorporated herein by reference, for methods of predicting excluded volumes of nucleic acids).

[0246] The “nucleic acid capture activity” of the array may be a hydrophilic compound, a compound which reacts covalently with the nucleic acid polymers of the pool, an oligonucleotide complementary to a sequence shared by all members of a pool (e.g., an oligonucleotide complementary to the 12 bp cohesive ends of a phage λ library, or oligonucleotide(s) complementary to one or both ends of a PCR-generated library containing large inserts and 6 to 50 bp of one strand exposed at one or both ends) or some other capture ligand including but not limited to proteins, peptides, intercalators, biotin, avidin, antibodies or fragments of antibodies or the like.

[0247] An ordered array of nucleic acid capture ligand spots may be made using a commercially-available micro-array synthesizer, modified inkjet printer (Castellino, 1997, supra), or the methods disclosed by Fodor et al. (U.S. Pat. No. 5,510,270), Lockhart et al. (U.S. Pat. No. 5,556,752) and Chetverin and Kramer (WO 93/17126). Alternatively, details on the design, construction and use of a micro-array synthesizer are available on the World Wide Web at www.cmgm.stanford.edu/pbrown.

[0248] An excess of nucleic acid or DNA is then applied to the surface of the microfabricated capture array. Each spot has multiple chances to bind a free nucleic acid molecule. However, once a spot has bound a nucleic acid molecule, it is protected from binding other molecules, i.e., the excluded volume of the bound DNA protects the spot from binding more than one molecule from the pool. Thus, saturation binding, or a situation very close to it, may be achieved while retaining the optimal spacing for subsequent amplification and replication.
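The contrast between loading by dilution and loading with excluded-volume protection can be illustrated with a simple Monte Carlo sketch. The model below is a deliberately idealized assumption, not a description of the actual binding kinetics: each applied molecule encounters one randomly chosen spot, and in the protected case a spot that already holds a molecule rejects every later arrival. The function name and parameters are hypothetical.

```python
import random


def singly_occupied_fraction(n_spots, n_molecules, exclusive, seed=0):
    """Fraction of spots holding exactly one molecule after loading.

    exclusive=True  models excluded-volume protection (one molecule per spot).
    exclusive=False models unprotected, Poisson-like loading.
    """
    rng = random.Random(seed)
    counts = [0] * n_spots
    for _ in range(n_molecules):
        spot = rng.randrange(n_spots)
        if exclusive and counts[spot] >= 1:
            continue  # bound molecule's excluded volume blocks the newcomer
        counts[spot] += 1
    return sum(1 for c in counts if c == 1) / n_spots


spots = 5000
print("dilute, unprotected (1x):  ", singly_occupied_fraction(spots, spots, False))
print("excess, unprotected (10x): ", singly_occupied_fraction(spots, 10 * spots, False))
print("excess, protected (10x):   ", singly_occupied_fraction(spots, 10 * spots, True))
```

In this toy model an excess of molecules drives the protected array toward saturation with one molecule per spot, whereas the same excess on an unprotected array leaves almost no singly occupied spots.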

[0249] The array resulting from this process may be amplified in situ and replicated according to methods described herein. Alternatively, or in addition, the array may be treated in a way which decreases the excluded volume of the captured group so that additional rounds of excluded volume protecting group (EVPG) addition may be performed. Arrays produced in this manner not only increase the efficiency of the array beyond that normally allowed by the Poisson distribution, but also can be of predetermined geometry and/or aligned with other microfabricated features. In addition, such arrays allow complicated highly parallel enzymatic or chemical syntheses to be performed on large DNA arrays.

EXAMPLE 9

[0250] Replica-destructive Amplification Methods

[0251] A major advantage of the replica amplification method is that because there are multiple copies of a particular array, information is not lost if a given replica is destroyed or rendered non-re-usable by a process. This allows the use of the most sensitive detection methods, regardless of their impact on the subsequent usefulness of that particular replica of the array. For example, tyramide-biotin/HRP (or other enzymatic in situ reactions) or biotin/avidin or antibody/hapten complexes (or other ligand sandwiches) may be used to effectively amplify the signal in a nucleic acid hybridization (or other bimolecular binding) experiment. These methods, however, may be considered destructive to the DNA array in that they involve interactions which are kinetically difficult to disrupt without destroying the array. Similarly, some detection processes, including sequencing by ligation and restriction and the variant methods described herein (see Examples 11 and 12), necessarily involve destruction, either chemically or enzymatically or both, of the template array. The availability of replica arrays made according to the methods disclosed herein allows the use of these methods, as they destroy only the replica, not the original or other copies.

[0252] The availability of replicas of an array allows the use of direct fluorescent detection of probes hybridized to the array without loss of the array for subsequent uses. One method which this allows is the relative quantitation of mRNA by hybridization of the array with fluorescently labeled total cDNA probes. This method allows the evaluation of changes in the expression of a wide array of genes in populations of RNA isolated from cells or tissues in different growth states or following treatment with various stimuli.

[0253] Fluorescently labeled cDNA probes are prepared according to the methods described by DeRisi et al., 1997, Science 278: 680-686 and by Lockhart et al., 1996, Nature Biotechnol. 14: 1675-1680. Briefly, each total RNA (or mRNA) population is reverse transcribed from an oligo-dT primer in the presence of a nucleoside triphosphate labeled with a spectrally distinguishable fluorescent moiety. For example, one population is reverse transcribed in the presence of Cy3-dUTP (green fluorescence signal), and another reverse transcribed in the presence of Cy5-dUTP (red fluorescence signal).

[0254] Hybridization conditions are as described by DeRisi et al. (1997, supra) and Lockhart et al. (1996, supra). Briefly, final probe volume should be 10-12 μl, at 4× SSC, and contain non-specific competitors (e.g., poly dA, C₀T1 DNA for a human cDNA array) as required. To this mixture is added 0.2 μl of 10% SDS and the probes are boiled for two minutes and quick chilled for ten seconds. The denatured probes are pipetted onto the array and covered with a 22 mm × 22 mm cover slip. The slide bearing the array is placed in a humid hybridization chamber which is then immersed in a water bath (62° C.) and incubated for 2-24 hours. Following incubation, slides are washed in solution containing 0.2× SSC, 0.1% SDS and then in 0.2× SSC without SDS. After washing, excess liquid is removed by centrifugation in a slide rack on microtiter plate carriers. The hybridized arrays are then immediately ready for scanning with a fluorescent scanning confocal microscope. Such microscopes are commercially available; details concerning design and construction of a scanner are also available on the World Wide Web at www.cmgm.stanford.edu/pbrown.

[0255] In the above example in which one population of RNA was reverse-transcription labeled with Cy3 and the other with Cy5 fluorescent dyes, the relative expression of genes represented by the features of the micro-array may be evaluated by the presence of green (Cy3, indicating the mRNA from this population hybridizes to a given feature), red (Cy5, indicating the mRNA from this population hybridizes to a given feature) or yellow (indicating that both mRNA populations used to make probes contain mRNAs which hybridize to a given feature) fluorescent signals.
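One way the green, red and yellow calls described above might be made from scanned intensities is sketched below in Python. The background level and the ratio cutoff are hypothetical values chosen for illustration only; in practice they would be set empirically for a given scanner and array, and the function name `call_feature` is not taken from any scanner software.

```python
def call_feature(cy3, cy5, background=100.0, ratio_cutoff=2.0):
    """Classify one feature from its Cy3 (green) and Cy5 (red) intensities.

    background   -- assumed intensity below which a channel counts as absent
    ratio_cutoff -- assumed fold difference needed to call one color dominant
    """
    cy3_on = cy3 > background
    cy5_on = cy5 > background
    if not cy3_on and not cy5_on:
        return "none"      # neither probe population hybridized
    if cy3_on and not cy5_on:
        return "green"     # only the Cy3-labeled population hybridized
    if cy5_on and not cy3_on:
        return "red"       # only the Cy5-labeled population hybridized
    ratio = cy3 / cy5
    if ratio >= ratio_cutoff:
        return "green"
    if ratio <= 1.0 / ratio_cutoff:
        return "red"
    return "yellow"        # both populations hybridize at similar levels


for cy3, cy5 in [(5000, 50), (40, 3000), (2500, 2200), (20, 30)]:
    print(cy3, cy5, call_feature(cy3, cy5))
```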

[0256] Alternatively, separate replicas of the same array may be hybridized separately with probes labeled with the same fluorescent dye marker but made from different populations of mRNA. For example, cDNA probes made from cells before and after treatment with a growth factor may be hybridized with separate replicas of a genomic array made from those cells. The intensity of the signal of each feature may be compared before and after growth factor treatment to yield a representation of genes induced, repressed, or whose expression is unaffected by the growth factor treatment. This method requires that the replica arrays contain one or more markers which will not vary as a means of aligning the hybridized arrays. Such a marker may be a foreign or synthetic DNA, for example. The RNA corresponding to such a marker is spiked at equal concentration into the reverse transcription reactions used to generate labeled cDNA probes. Prior to the first hybridization with experimental cDNAs, a control hybridization using only the marker cDNA may be performed on a replica array to precisely determine the position(s) of the marker(s) within the array.
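Because the marker RNA is spiked at equal concentration into each labeling reaction, its signal could in principle also serve as an intensity reference when comparing separately hybridized replicas. The Python sketch below illustrates that idea under stated assumptions: the use of the marker for intensity scaling (rather than only for alignment) is an assumption of this sketch, and the feature names and intensity values are invented for illustration.

```python
def normalize_to_markers(intensities, marker_ids):
    """Scale every feature by the mean signal of the invariant marker features."""
    marker_mean = sum(intensities[m] for m in marker_ids) / len(marker_ids)
    return {fid: value / marker_mean for fid, value in intensities.items()}


def fold_changes(before, after, marker_ids):
    """Marker-normalized after/before ratio for each non-marker feature."""
    norm_before = normalize_to_markers(before, marker_ids)
    norm_after = normalize_to_markers(after, marker_ids)
    return {
        fid: norm_after[fid] / norm_before[fid]
        for fid in norm_before
        if fid not in marker_ids and norm_before[fid] > 0
    }


# Hypothetical intensities from two replicas of the same array.
before = {"marker": 1000.0, "geneA": 500.0, "geneB": 800.0}
after = {"marker": 2000.0, "geneA": 4000.0, "geneB": 1600.0}
print(fold_changes(before, after, ["marker"]))
```

In this invented example the second replica is uniformly twice as bright, so the marker-normalized comparison reports geneB as unchanged and geneA as induced.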

[0257] In either the simultaneous hybridization or the separate hybridization methods, the availability of additional replicas of the array allows further characterization (including but not limited to sequencing and isolation of the gene represented by the feature) of those features of the array which exhibit particular expression patterns.

EXAMPLE 10

[0258] Geometrical Focusing

[0259] A characteristic of the replica amplification process is that each replica feature will tend to occupy a larger area than the feature from which it was made. This is because the feature molecules transferred to the replica may come from anywhere within the circumferential area occupied by the template feature. Subsequent amplification of the transferred molecules will necessarily increase the area occupied by the feature relative to that occupied by the template feature. It is clear that this phenomenon will limit the practical number of times an array may be sequentially replicated without contamination of surrounding features. There are several approaches to solving this problem.

[0260] First, as mentioned previously, more than one replica of an amplified array may be made per amplification. It is clear that the “earlier” in the replication process a given array is replicated, the less area its features will occupy relative to those made later. That is, the more replicas one can make of an original amplified array before re-amplifying the template, the more arrays with smaller features one will have. The number of replicas of a given array which may be made without re-amplification of the template may be determined empirically by, for example, hybridization of a sequential series of amplified replicas from a single array with an oligonucleotide which hybridizes with a sequence common to every feature. Comparison of the hybridization signals from the first replica to those of subsequent replicas made from the same template without re-amplification of the template will indicate at what point features begin to be lost from the replicas.

[0261] Second, one may reduce the number of PCR cycles used in the amplification process. Because the amplification is exponential, a small change in the cycle number can have a profound influence on the area occupied by the feature. This will clearly not solve the problem completely, but when combined with the first approach it can extend the useful number of cycles of amplification and replication for a given array. The practical number of PCR cycles to use for each round of amplification may also be estimated empirically by making several replicas from a single template array without re-amplification, and then subjecting individual replicas in the series to increasing numbers of PCR cycles. For example, replicas may be subjected to 10, 20, and 30 amplification cycles, followed by hybridization with a fluorescent probe sequence common to all features of the array. Visualization of the hybridized array by fluorescence microscopy will indicate at which point the features begin to intrude upon one another. Clearly, the starting size of the feature will influence the number of PCR cycles allowable per replication cycle, but it is within the ability of one skilled in the art to determine generally how many cycles are optimal to obtain enough DNA for subsequent rounds of replica amplification without widespread contamination of surrounding features.

[0262] A third approach recognizes the fact that the amplified features occupy more than just the two dimensional area of the surface they sit upon. Rather, each amplified feature occupies a hemispherical space with a radius, r. If the features are situated on one slide, which for discussion will be designated the “bottom” slide, and covered by another slide (the “top” slide) set at a uniform, fixed distance from the bottom slide, one will note that as the hemispherical feature expands with rounds of amplification, the portion of the growing hemisphere which first contacts the top slide will be much smaller in cross-sectional area than the portion in contact with the bottom slide. This presents a smaller surface area, with all sequence information intact, from which to make replicas that do not occupy greater surface area than their template features. This method will be referred to as “geometrical focusing.”

[0263] For example, after 30 cycles in 15% polyacrylamide, 500 bp amplicons will form hemispheres with a 10 micron radius. The length of the template and the percentage of acrylamide in the gel influence the size of the amplified features such that, for a given number of cycles, the size of the features decreases as the length of the template or the percentage of acrylamide increases. In general, the size of an amplified feature with respect to a given number of amplification cycles under given conditions is determined empirically by visualizing it with a fluorescent confocal microscope or fluorimager after staining with a fluorescent intercalator. Labeled primers or nucleotides may also be used to “light up” the feature for measurement by this method.

[0264] The distance between the surface bearing the array and the surface the array is to be transferred to may be controlled using plastic spacers of the desired thickness along the edges of the slide. A small volume of polyacrylamide solution plus capillary action will take the volume out to the edges of a predetermined area of coverslip.

[0265] Another contemplated method of regulating or controlling the distance between surfaces in the geometrical focusing method involves the use of optical feedback, such as Newton rings or other interferometry, to adjust pressure locally across the surfaces. The adjustment may be accomplished by a scanning laser that heats a differential thermal expansion plate differentially based on the optical feedback.

[0266] As mentioned above, bioactive substances such as enzymes may be cast directly in polyacrylamide gels. Other reagents, including buffers and oligonucleotide primers, may be either cast into the gels or added by diffusion or even electrophoretic pulses to the pre-formed gel matrices. If the upper plate has little or no adhesiveness to the gel (achieved, for example, through silane coating as described above), then when it is removed, the upper circle of each hemisphere is the only exposed DNA. Some of the exposed DNA can be transferred by microcontact printing using either plate, or by another round of polymerization from the upper plate. The radius of the circle exposed for transfer will be c=sqrt(r²−d²), where r is the radius of the hemisphere and d is the distance between the plates. Therefore, when r=10 microns and d=8 microns, the radius of the exposed circle is c=6 microns, less than the radius of the template feature. This exposed circle will thus have a cross-sectional area less than that occupied by the template feature, referred to as q, at the surface of the support. This slight reduction in the radius, and consequently the cross-sectional area of the transferred feature, will work to keep the amplified replica features sharper through several rounds of replication. The distance between the plates may be 10%, 20%, 30%, 40%, on up to 50% or more less than the radius of the features being transferred. The surface area (of the support) occupied by the transferred features may be considered reduced or lessened if it is 10%, 20%, 30%, 40%, on up to approximately 80% less than the area occupied by features on the template array. The resolution of the features is considered to be preserved if the features remain essentially distinct after amplification of the transferred nucleic acid.
It is noted that features which amplify with lower efficiency than others may be lost if the distance between plates is too large. Therefore, geometrical focusing will be most useful when combined with the other two approaches described for limiting the size of amplified replicas. That is, the number of replicas made from individual arrays early in the process should be maximized while the number of PCR cycles per amplification should be minimized.
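The geometry of this Example can be worked through numerically. The Python sketch below evaluates c=sqrt(r²−d²) and the corresponding reduction in area for the values given above; the function names are hypothetical, and the hemisphere is treated as ideal. It also shows the point made in the preceding paragraph: a feature whose radius does not reach the top plate exposes no circle at all and is therefore not transferred.

```python
import math


def exposed_radius(r, d):
    """Radius c of the circle where a hemisphere of radius r meets a plate
    at height d above the support; 0.0 if the hemisphere does not reach it."""
    if d >= r:
        return 0.0
    return math.sqrt(r * r - d * d)


def area_reduction(r, d):
    """Fractional reduction of the exposed area relative to the footprint
    of the template feature (a circle of radius r on the support)."""
    c = exposed_radius(r, d)
    return 1.0 - (c * c) / (r * r)


r, d = 10.0, 8.0  # microns, as in the example above
print(f"c = {exposed_radius(r, d):.1f} microns")
print(f"area reduced by {100 * area_reduction(r, d):.0f}%")
# A poorly amplified feature (smaller r) never touches the top plate:
print(f"c for r = 7 microns: {exposed_radius(7.0, d):.1f} microns")
```

For r=10 microns and d=8 microns this gives c=6 microns and an exposed area 64% smaller than the template footprint, within the range of reductions recited above.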

EXAMPLE 11

[0267] Replica Sequencing with Ligation/Restriction Cycles

[0268] The sequencing by ligation and restriction method of Brenner, as described above, provides a powerful approach to the simultaneous sequencing of entire arrays of DNA molecules. The ability to replicate the entire array provides a novel approach to improving the efficiency of the sequencing method. In its standard format, the number of bases sequenced by the ligation and restriction method is limited by a background of molecules which fail to ligate or cleave properly in a given cycle. This phenomenon disturbs the synchrony of the process and limits the effective lengths which may be sequenced by this method since the interference it introduces is cumulative.

[0269] The sequencing by ligation and restriction method as disclosed by Brenner addresses this issue by the optional inclusion of a “capping” step after the unligated probe has been removed. According to that method, when the target molecules have a 5′ protruding end, a mixture of dideoxynucleoside triphosphates and a DNA polymerase is added prior to the next cleavage step. This results in the addition of a single dideoxynucleotide to the 3′ terminus of the recessed strand which will prevent subsequent ligation steps, effectively deleting the molecule which failed to be ligated from the target population. The effectiveness of the capping method is dependent on the completeness of the cap addition.

[0270] An improvement on the method of sequencing by ligation and cleavage involves the use of two or more distinct probes comprising different “ligation cassettes” coupled with a round of replica amplification by PCR wherein one of the primers is specific to the most recently added ligation cassette. This method will be referred to as “replica sequencing with ligation and restriction cycles.” A probe of use in this method is a double-stranded polynucleotide which (i) contains a recognition site for a nuclease, (ii) typically has a protruding strand capable of forming a duplex with a complementary protruding strand of the target polynucleotide, and (iii) has a sequence, the “ligation cassette,” such that an oligonucleotide primer complementary to one such sequence or cassette will allow amplification of the molecule to which it is ligated under the conditions used for annealing and extension within the method.

[0271] In each sequencing cycle, only those probes whose protruding strands form perfectly-matched duplexes with the protruding strand of the target polynucleotide hybridize and are then ligated to the end of the target polynucleotide. The probe molecules are divided into four populations, wherein each such population comprises one of the four possible nucleotides at the position to be determined, each labeled with a distinct fluorescent dye. The remaining positions of the duplex-forming region are occupied with randomized, unlabeled bases, so that every possible multimer the length of that region is represented; therefore, a certain percentage of probe molecules in each pool are complementary to the single-stranded region of the target polynucleotide; however, only one pool bears labeled probe molecules that will hybridize.

[0272] The individual probes comprising different ligation cassettes may have a recognition sequence for the same or different type IIS restriction endonuclease. The important factor is that the ligation cassette sequences, due to their distinct primer binding characteristics, allow amplification of only those target molecules which were successfully ligated in the previous ligation step. This also enforces the requirement for completing the cleavage step, as those target molecules which were not cleaved in the previous step will similarly not be amplified, since they will not bear the proper primer. This process enriches the proportion of each feature which has successfully completed the most recent cycle of ligation and restriction. Through the reduction in background due to improved synchrony, this method increases the number of bases which can be sequenced for features on a given array. The added steps of the replication and subsequent re-amplification of the array not only further enrich for sequences which are in synchrony, but also confer control over the size of the features, as described herein in the section entitled “Geometrical Focusing.” As discussed in that section, control over the size of the features with increasing numbers of amplification or replication cycles allows more sequence or other information to be gleaned from a given array before features begin to overlap.

[0273] After a cycle of cleavage, ligation of a first ligation cassette, and subsequent detection of the next base in the sequence, the steps one will perform in applying the replica amplification process to this method of sequencing are as follows: 1) using primers, one complementary to the common end (arbitrarily designated the 5′ end, for this discussion) of the features being sequenced, and the other complementary to the most recently added ligation cassette, the features of the array are amplified and then replicated according to methods described herein above; 2) a replica is then subjected to a new cycle of cleavage, ligation of a probe comprising a distinct ligation cassette, and detection of the next base in the sequence; 3) the features of the array are amplified using the primer complementary to the common 5′ end of the features and a primer complementary to the distinct ligation cassette, followed by replication of the array; and 4) the process of steps 1-3 is repeated until the sequences of the features are determined.

[0274] Within the method of replica sequencing with ligation and restriction cycles, a new probe comprising a distinct ligation cassette sequence may be used for each cycle of ligation and restriction. Alternatively, fewer different ligation cassettes than the number of cycles of ligation and restriction may be used. In other words, as few as two and as many as n (where n equals the number of cycles of ligation and restriction) different ligation cassettes may be of use according to the method. As used herein, “new” or “different” or “distinct” when referring to probes or ligation cassettes comprised by probes is meant to indicate that the sequence of each ligation cassette, or the oligonucleotide probe comprising it, is such that a primer complementary to the ligation cassette will not hybridize with any other cassette or oligonucleotide comprising a cassette under the conditions used for annealing and polymerization. Clearly, the greater the number of different ligation cassettes used, the more strictly the requirement for completion of previous cycles will be enforced. It is within the ability of one of skill in the art to determine how many different ligation cassettes are required to achieve a desired level of synchrony (with a concomitant reduction in background). As a general guideline, since the background due to incomplete cycles is cumulative, the number of ligation cassettes will vary in proportion to the desired number of bases to be sequenced. One would, for example, expect to use a larger number of different ligation cassettes if 300 bases are to be sequenced than one would use to sequence 30 bases.
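The effect of the number of distinct ligation cassettes on synchrony can be explored with a toy bookkeeping model. The Python sketch below is an idealized illustration only and rests on several assumptions that are not taken from the disclosure: every molecule completes a cycle of cleavage and ligation with the same probability, cassettes are reused in strict rotation, amplification is performed after every cycle, and PCR multiplies any molecule bearing the current cassette by a fixed gain while leaving all others unamplified. The function name and default values are hypothetical.

```python
def in_sync_fraction(n_cycles, n_cassettes, p_success=0.9, gain=100.0):
    """Fraction of a feature's molecules still in phase after n_cycles.

    The pool maps (lag, cassette) -> relative abundance, where lag is the
    number of cycles a molecule has fallen behind and cassette is the index
    of the cassette it currently carries (None before the first ligation).
    """
    pool = {(0, None): 1.0}
    for cycle in range(n_cycles):
        current = cycle % n_cassettes  # cassettes are reused in rotation
        nxt = {}
        for (lag, cassette), amount in pool.items():
            # Success: cleaved and ligated to the current cassette.
            key = (lag, current)
            nxt[key] = nxt.get(key, 0.0) + amount * p_success
            # Failure: falls one cycle behind and keeps its old cassette.
            key = (lag + 1, cassette)
            nxt[key] = nxt.get(key, 0.0) + amount * (1.0 - p_success)
        # PCR with a primer specific to the current cassette.
        for key in nxt:
            if key[1] == current:
                nxt[key] *= gain
        total = sum(nxt.values())
        pool = {key: amount / total for key, amount in nxt.items()}
    return sum(amount for (lag, _), amount in pool.items() if lag == 0)


for m in (1, 2, 4):
    print(m, "cassette(s):", round(in_sync_fraction(10, m), 4))
```

In this model a single cassette provides no selection after the first cycle, because every previously ligated molecule still matches the primer. With two or more cassettes a molecule that stalls is passed over by the primer until the rotation returns to the cassette it carries, so a larger set of cassettes lets fewer out-of-phase molecules be amplified, in keeping with the guideline stated above.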

[0275] Replication of the arrays in the method of replica sequencing by ligation and restriction may be performed as often as every cycle, once every nth cycle (where n is greater than 1), or even once per whole set of cycles. Again, the frequency of replication may be determined by one skilled in the art. Considerations include, but are not limited to, the physical size of the features and the overall desired number of bases to be sequenced.

[0276] The method of Jones, 1997, Biotechniques 22: 938-946 teaches the use of PCR amplification to positively select for those molecules in a population which had successfully completed the previous cycle of cleavage and ligation. Jones did not, however, teach the replication of amplified populations or the application of the method to random arrays of features. Rather, Jones taught the use of microwell plates and a robotic pipetting apparatus to perform his method. An important advantage of the incorporation of the replication step into the sequencing method is that it allows control over the size of the amplified features. While Jones mentions the eventual application of his method to the “biochip” format, no guidance is given which would allow one to overcome the inherent limitation on the size of the features in a method incorporating PCR amplification steps on a microarray. In contrast, novel methods based on the replication of arrays, such as geometrical focusing, are described herein which overcome this limitation.

EXAMPLE 12

[0277] Non-Replica Sequencing

[0278] Methods allowing determination of DNA sequences on an array that do not involve replica production are also preferred for some applications. For example, sequencing of transcription products (or their reverse transcripts) in situ requires that the fine resolution of the sequencing templates be preserved.

[0279] One may use the method of Jones (1997, supra) to sequence features on an array without replicating the array. Other non-electrophoretic methods which might be adapted to sequencing of microarrays include the single nucleotide addition methods of minisequencing (Canard & Sarfati, 1994, Gene 148: 1-6; Shoemaker et al., 1996, Nature Genet. 14: 450-456; Pastinen et al., 1997, Genome Res. 7: 606-614; Tully et al., 1996, Genomics 34: 107-113; Jalanko et al., 1992, Clin. Chem. 38: 39-43; Paunio et al., 1996, Clin. Chem. 42: 1382-1390; Metzker et al., 1994, Nucl. Acids Res. 22: 4259-4267) and pyrosequencing (Uhlen & Lundeberg, U.S. Pat. No. 5,534,424; Ronaghi et al., 1998, Science 281: 363-365; Ronaghi et al., 1999, Anal. Biochem. 267: 65-71).

[0280] As an alternative to minisequencing or pyrosequencing, the novel method of fluorescent in situ sequencing extension quantification (FISSEQ) may be used. FISSEQ involves the following steps: 1) a mixture of primer, buffer and polymerase is added to a microarray of single stranded DNA; 2) a single, fluorescently labeled base is added to the mixture, and will be incorporated if it is complementary to the corresponding base on the template strand; 3) unincorporated dNTP is washed away; 4) incorporated dNTP is detected by monitoring fluorescence; 5) steps 2-4 are repeated (using fresh buffer and polymerase) with each of the four dNTPs in turn; and 6) steps 2-5 are repeated in cycles until the sequence is known.
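The logic of steps 2 through 6 can be sketched as a small simulation. The Python fragment below is an idealized illustration under stated assumptions: incorporation is complete and error-free, a run of identical template bases is filled in a single addition with a signal proportional to the number of bases incorporated, and the signal from one addition does not carry over to the next. The function name, the order of dNTP addition and the example templates are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def fisseq(template, order="ACGT", max_rounds=50):
    """Simulate sequential single-dNTP additions on one template.

    template -- template bases downstream of the primer, written in the
                order in which the polymerase encounters them
    order    -- order in which the four dNTPs are offered in each round
    Returns (read, signals): the extended strand, and one (dNTP, count)
    entry per addition, where count is the number of bases incorporated.
    """
    position = 0
    read = []
    signals = []
    rounds = 0
    while position < len(template) and rounds < max_rounds:
        for dntp in order:                       # step 5: each dNTP in turn
            incorporated = 0
            # step 2: extend while the offered dNTP pairs with the template
            while (position < len(template)
                   and COMPLEMENT[template[position]] == dntp):
                position += 1
                incorporated += 1
            signals.append((dntp, incorporated))  # step 4: detect signal
            read.append(dntp * incorporated)
        rounds += 1                              # step 6: repeat
    return "".join(read), signals


read, signals = fisseq("GACTAT")
print(read)
print(signals)
```

Additions that give no signal are as informative as those that do, since the base at each position is inferred from which dNTP produced a signal and from how large that signal was.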

[0281] The method of sequencing nucleic acid molecules within a polyacrylamide gel matrix using the Fluorescent In Situ Sequencing Extension Quantification method and nucleotides labeled with cleavable linkers was demonstrated in the following experiments.

[0282] In order to evaluate the method, molecules of a known DNA sequence were first cast into a polyacrylamide gel matrix. The oligonucleotide sequencing primer RMGP1-R (5′-gcc cgg tct cga gcg tct gtt ta) was annealed to the oligonucleotide puc514c (Q-5′ tcggcc aacgcgcggg gagaggcggt ttgcgtatca g taaacagac gctcgagacc gggc (sample 1)) or to the oligonucleotide puc234t (Q-5′ cccagt cacgacgttg taaaacgacg gccagtgtcg a taaacagac gctcgagacc gggc (sample 2)). The bolded sequences (taaacagac gctcgagacc gggc in each template) denote the sequences to which the sequencing primer anneals, and Q indicates an ACRYDITE modification.

[0283] Equal amounts of template and primer were annealed at a final concentration of 5 μM in 1× EcoPol buffer (10 mM Tris pH 7.5, 5 mM MgCl₂), by heating to 95 degrees C. for 1 minute, slowly cooling to 50 degrees C. at a rate of 0.1 degrees per second, and holding the reaction at 50 degrees C. for 5 minutes. The primer:template complex was then diluted by adding 30 μl 1× EcoPol buffer and 2 μl 500 mM EDTA.

[0284] One microliter of each annealed oligonucleotide was added to 17 μl of acrylamide gel mixture (40 mM Tris pH 7.3, 25% glycerol, 1 mM DTT, 6% acrylamide (5% cross-linking), 17.4 units SEQUENASE version 2.0 (United States Biochemical, USB), 15 μg/ml E. coli single stranded binding protein (USB), 0.1 mg/ml BSA). Then, 1 μl of 1.66% TEMED and 1 μl of 1.66% APS were added and 0.2 μl of each mixture was pipetted onto bind-silane treated glass microscope slides. The slides were immediately put under an argon bed for 30 minutes to allow polymerization of the acrylamide.

[0285] The slides containing the spots of polyacrylamide containing DNA molecules to be sequenced were then washed in 40 mM Tris pH 7.5, 0.01% Triton X-100 for 30 seconds, after which the slides were ready for sequencing reactions. Each slide was subjected to a number of single nucleotide extension cycles (in the nomenclature adopted for the purposes of this example, a single nucleotide extension cycle means the addition of one nucleotide, not the sequential addition of each of the four nucleotides G, A, T, and C). For each cycle, the slide was incubated in extension buffer with one nucleotide for 4 minutes at room temperature. Between cycles, the slides were washed twice for minutes each in FISSEQ wash buffer (10 mM Tris pH 7.5, 250 mM NaCl, 2 mM EDTA, 0.01% Triton X-100), and spun briefly to dry. Slides were scanned on a GSI SCANARRAY 4000 fluorescence scanner.

[0286] In the first cycle, each slide was incubated in dATP extension mix (10 mM Tris pH 7.5, 50 mM NaCl, 5 mM MgCl₂, 0.1 mg/ml BSA, 0.01% Triton X-100, 0.2 μM unlabeled dATP). In the next cycle each slide was incubated in the dCTP extension mix (as above, with dCTP replacing dATP). In all, slide 1 was subjected to 5 cycles of unlabeled nucleotide addition (i.e., A, then C, then G, then T, then A), followed by 1 cycle of fluorescently labeled dCTP addition (10 mM Tris pH 7.5, 50 mM NaCl, 5 mM MgCl₂, 0.1 mg/ml BSA, 0.01% Triton X-100, 0.2 μM unlabeled dCTP, 0.2 μM Cy3-dCTP).

[0287] FIG. 1 shows a fluorescence scan of slide 1 after the cycle in which the labeled dCTP was added, above a schematic of the sequencing templates indicating the expected extension products for each template. Fluorescent label was detected in spots containing sample 1, where the sixth template nucleotide is a G, which allows the addition of the labeled C to the primer. No label was detected in spots containing sample 2, which agrees with the fact that the next template nucleotide was a T, which did not allow incorporation of the labeled C onto the primer. These data indicate that sequencing reactions in polyacrylamide spots remain in phase after 6 additions, and that misincorporation by the polymerase is not high under these conditions.

[0288] A second slide, slide 2, was subjected to 7 cycles of unlabeled nucleotide addition (i.e., A, then C, then G, then T, then A, then C, then G), followed by 1 cycle of Cy5-dUTP addition (10 mM Tris pH 7.5, 50 mM NaCl, 5 mM MgCl₂, 0.1 mg/ml BSA, 0.01% Triton X-100, 0.2 μM unlabeled dTTP, 0.2 μM Cy5-dUTP). FIG. 2 shows a scan of slide 2 after the Cy5-dUTP addition, and a schematic of the expected extension products. Since both nucleic acid sequencing template samples 1 and 2 encoded an A as the next base to be added to the primer, no signal is detected in spots containing either sample template. This confirms that the sequences were maintained in phase through 6 additions, and further indicates a lack of misincorporation by the polymerase under these conditions.

[0289] Slide 3 was subjected to 9 cycles of unlabeled nucleotide addition (A, then C, then G, then T, then A, then C, then G, then T, then A) followed by 1 cycle of Cy3-dCTP addition. The fluorescence scan of slide 3 is shown in FIG. 3. Fluorescently labeled C was correctly added to the primer on sample 1, but was not added to the primer on sample 2.

[0290] Finally, slide 4 was subjected to 11 cycles of unlabeled nucleotide addition (A, then C, then G, then T, then A, then C, then G, then T, then A, then C, then G), followed by 1 cycle of Cy5-dUTP addition. The fluorescence scan of slide 4 after the labeled dUTP cycle (FIG. 4) shows that dUTP was correctly added to the primer on sample 2.

[0291] The experiments shown in FIGS. 1-4 establish that the fluorescent in situ sequencing extension quantification method permits sequencing of at least twelve nucleotides on a template contained within a polyacrylamide gel. There was no indication of misincorporation by the polymerase under these conditions. Further, as shown by the similar detection of signal in each of 5 spots containing a given nucleic acid sequencing template in a given cycle, the sequencing reactions remained in phase for at least twelve nucleotide additions. There is no reason to believe further nucleotide additions would not be possible using these methods. In addition, any of the methods described herein below to further extend the sequence read length of the FISSEQ method may be used.

[0292] It is recognized that polymerases used for sequencing become inefficient for further extension when 100% of bases added to a primer are non-native (i.e., fluorescently labeled). Therefore, the efficiency of FISSEQ may be further improved by employing a mixture of native and fluorescently labeled dNTP. The mixture allows incorporation of labeled bases at each position without requiring 100% adjacent non-native bases. Also, a photobleaching step after each set of one or more cycles may be incorporated to allow the computational background subtraction to act on a smaller number, with corresponding lower Poisson shot noise.

[0293] As an alternative to photobleaching or computational subtraction of accumulating fluorescence, cleavable linkages between the fluorophore and the nucleotide may be employed to permit removal of the fluorophore after incorporation and detection, thereby setting the sequence up for additional labeled base addition and detection. As used herein, the term “cleavable linkage” refers to a chemical moiety that joins a fluorophore to a nucleotide, and that can be cleaved to remove the fluorophore from the nucleotide when desired, essentially without altering the nucleotide or the nucleic acid molecule it is attached to. Cleavage may be accomplished, for example, by acid or base treatment, or by oxidation or reduction of the linkage, or by light treatment (photobleaching), depending upon the nature of the linkage. Examples of cleavable linkages are described by Shimkus et al., 1985, Proc. Natl. Acad. Sci. USA 82: 2593-2597; Soukup et al., 1995, Bioconjug. Chem. 6: 135-138; Shimkus et al., 1986, DNA 5: 247-255; and Herman and Fenn, 1990, Meth. Enzymol. 184: 584-588, all of which are incorporated herein by reference.

[0294] As one example of a cleavable linkage, a disulfide linkage may be reduced using thiol compound reducing agents such as dithiothreitol. Fluorophores are available with a sulfhydryl (SH) group available for conjugation (e.g., Cyanine 5 or Cyanine 3 fluorophores with SH groups; New England Nuclear-DuPont), as are nucleotides with a reactive aryl amino group (e.g., dCTP). A reactive pyridyldithiol will react with a sulfhydryl group to give a disulfide bond that is cleavable with reducing agents such as dithiothreitol. An NHS-ester heterobifunctional crosslinker (Pierce) is used to link a deoxynucleotide comprising a reactive aryl amino group to a pyridyldithiol group, which is in turn reactive with the SH on a fluorophore, to yield a disulfide bonded, cleavable nucleotide-fluorophore complex useful in the methods of the invention (see, for example, FIG. 5).

[0295] Alternatively, a cis-glycol linkage between a nucleotide and a fluorophore can be cleaved by periodate. These are examples of standard components of cleavable cross-linkers used for protein chemistry or for polyacrylamide gels. In this embodiment, cleavage of the fluorophore could be done as often as every cycle, or less frequently, such as every other, every third, or every fifth or more cycles.

[0296] A modified embodiment of FISSEQ that allows longer effective reads involves extension for a fixed number of cycles with mixtures of three native (unlabeled) dNTPs interspersed with pulses of wash, up to a desired length. Following this, one begins cycles of adding one partially labeled (i.e., mixture of labeled and unlabeled) dNTP at a time. The triple dNTP cycles allow positioning of the polymerase a fixed distance from the primer and would use alternating sets of triphosphates (e.g., ACG, CGT, ACG, . . .) chosen and concentration optimized to reduce false incorporation and failure to incorporate (Hillebrand et al., 1984, Nucl. Acids Res. 12: 3155-3171). This allows three times longer reads plus any advantage possibly conferred by having fewer potential misincorporation steps. It is contemplated that if the misincorporation rate (n−1 and extensible n+1 products) can be as low as 10⁻⁴, then read lengths longer than current electrophoresis-based methods are possible.

[0297] Another modification using the triple dNTP cycles is aimed at reducing the background caused by mismatch incorporation. If, for example, G:T mismatch pairing is a major source of misincorporation (Keohavong et al., 1993, PCR Meth. Appl. 2: 288-292), one should always include A with G, since the more stable A:T interaction will be favored over the less stable G:T interaction. For example, one may alternate triple mix 1 (dATP, dCTP, dGTP) with triple mix 2 (dCTP, dGTP, dTTP).
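The way alternating triple mixes advance the polymerase to defined stopping points can be sketched as follows. This Python example is an illustrative model under stated assumptions, not a description of the experiments: extension is assumed to be complete and error-free, so that each pulse stops exactly where the template calls for the one dNTP missing from the mix. The function names and the toy template are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def extend_with_mix(template_read_order, pos, mix):
    """Advance from pos while the required dNTP is present in the mix."""
    while (pos < len(template_read_order)
           and COMPLEMENT[template_read_order[pos]] in mix):
        pos += 1
    return pos


def triple_mix_positions(template_read_order, mixes=("ACG", "CGT"), pulses=4):
    """Return the primer extension length after each alternating pulse.

    mixes holds the dNTPs present in triple mix 1 (dATP, dCTP, dGTP) and
    triple mix 2 (dCTP, dGTP, dTTP); a wash is assumed between pulses.
    """
    pos = 0
    positions = []
    for i in range(pulses):
        pos = extend_with_mix(template_read_order, pos, mixes[i % len(mixes)])
        positions.append(pos)
    return positions


# Toy template (hypothetical), listed in the order the polymerase reads it.
example_positions = triple_mix_positions("GACTAGGTCA")
```

With mix 1 the extension halts before each template A (which requires dTTP), and with mix 2 it halts before each template T (which requires dATP), so the stopping points are determined by the template sequence rather than by the number of single-base cycles.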

[0298] A more conservative version of FISSEQ which can allow determination of longer stretches of sequences at a time requires replicas of the array, and will be referred to as replica-FISSEQ. Replica arrays for this method may be made by the replica amplification methods described herein, or by a microarray spotting method using a microarray robot. By spotting the same DNA templates in known positions on the slide, the same effect can be obtained as with the replica-amplified features. In this embodiment, 30 identical arrays are made using the microarray robot. Stepping through 1 to 30 additions with native (unlabeled) dNTPs sets up the final base to be assessed for each array element (e.g., slide 1 gets zero native base additions, slide 2 gets one native base addition, etc.). The final base is assessed by the sequential addition of each fluorescent dNTP as is normally done in minisequencing. Pyrosequencing data (Ronaghi et al., 1998, Science 281: 363) has shown that the polymerase extension reactions stay accurately in phase through at least 30 cycles of dNTP addition using natural nucleotides and Klenow exo⁻ polymerase. To read out N bases with the single slide method described above requires 4N cycles of nucleotide addition and washing. The N-slide (triple dNTP, 4 cycles per slide) method (using N replicas), requires 2N(N−1)/3 cycles. The actual read lengths will be more than N bases (1.4N on average due to runs of identical bases). The same number of scans are required for the two methods.

[0299] Several other modifications to the basic method of FISSEQ are contemplated. For example, a loop may be incorporated into the primer to help reduce mispriming events (Ronaghi et al., 1998, Biotechniques 25: 876-878, 880-882, and 884). A particularly useful loop structure, described by Hirao et al. (1994, Nucl. Acids Res. 22: 576-582) as “extraordinarily stable,” would have the advantage of having a relatively short stem, lowering the stability of the complementary strand hairpin, the result being that the asymmetric PCR for the strand that we want will extend to the correct end more efficiently.

[0300] Another modification would address the difficulty, encountered in many methods, of sequencing past long repeating stretches. If it is known that a given array contains many such sequences, one may include a defined regimen (for example, halfway through the whole sequence) of deoxy- and dideoxynucleotides to reduce out-of-phase templates. That is, if one knows he or she is sequencing through a repeat of, for example, AC dinucleotides, one may reduce the number of out-of-phase molecules by following a dATP addition with a ddATP addition. Only those molecules which failed to incorporate the deoxy-form of the nucleotide will be available to incorporate the dideoxy-form, leading to chain termination and reduction of that source of background. Clearly, similar regimens may be devised for repeats involving more than two nucleotides. It should be noted that the strategy is not limited to repeats and may be used to extend read length in any situation where most of the sequences in the array have a block of sequence part of the way through the target sequence which is known. For example, in an array of targets, most having the unique sequence ACGTA at the same distance from the primer, one may reduce the number of out-of-phase molecules by following a dATP addition with a ddATP, ddGTP, and ddTTP addition, then dCTP followed by ddATP, ddCTP, and ddTTP addition.

EXAMPLE 13

[0301] Gel Sequencing of Amplified Array Features Using Dye Terminators

[0302] In addition to the methods of sequencing by hybridization and sequencing by ligation and restriction, it is possible to sequence amplified features of arrays using fluorescently labeled dideoxynucleoside triphosphates (“dye terminators”) using the Sanger (“dideoxy”) sequencing method (Sanger et al., 1975, J. Mol. Biol. 94: 441) and a micro gel system. In this embodiment, the array of amplified features is created in a linear arrangement along one edge of a very thin slab gel or at the edge of a microfabricated array of capillaries. DNA molecules of the pool to be sequenced are prepared in any of the same ways as for the random array spot format described above, such that each molecule in the pool has a known sequence or sequences at one or both ends which may serve as primer binding sites. The DNA is applied to the slide as in the random array format, except that it is restricted to a thin line, rather than a circular spot. Alternatively, the DNA may be derived as a replica of a line within a standard 2D array, or may be derived as a replica of a line from a metaphase chromosome spread.

[0303] Features of the deposited linear array are then amplified using any of the methods described above for amplification of spot arrays. This amplification may be linear or exponential, thermocycled or isothermal. Isothermal amplification methods include the Phi29 rolling circle amplification method (Lizardi et al., 1998, Nature Genetics 19: 225-232), reverse transcriptase/T4 DNA polymerase/Klenow/T7 RNA polymerase linear amplification (Phillips and Eberwine, 1996, Methods 10: 283-288) and a T7 DNA polymerase/thioredoxin/ssb system (Tabor and Richardson, January 1999 Department of Energy Human Genome Program Abstract No. 15).

[0304] The amplified DNA template may be replicated using the methods described above. This template, which is immobilized either covalently, by entanglement, or by steric hindrance of the gel (or other semi-solid), is then reacted with dye terminators in the presence of the other necessary components of the dideoxy sequencing method (i.e., primer, dNTPs, buffer and polymerase). It is well known in the art that a number of polymerases may be used for dideoxy-sequencing, including but not limited to Klenow polymerase, Sequenase™ or Taq polymerase. A major advantage of dye terminators over fluorescently labeled primers (“dye primers”) is that the use of dye terminators requires only one reaction containing four distinguishably labeled terminators, whereas the use of dye primers requires four separate reactions which would require four identical amplified features and software alignment of the post-size-separation pattern. It should be noted that dye terminators also exist for RNA polymerase sequencing (Sasaki et al., 1998, Proc. Natl. Acad. Sci. USA 95: 3455-3460). It should also be noted that if the termination reactions have been performed with the use of primers, then a rare-cutting endonuclease may be used to produce a desired end for the sequencing ladder.

[0305] A miniature gel system appropriate for the gel sequencing of linear feature arrays has been described by Stein et al., 1998, Nucl. Acids Res. 26: 452-455. In this system, small, ultrathin polyacrylamide gels are cast, eight or more at a time, on standard microscope slides. The gels may be stored, ready to use, for approximately two weeks. They are run horizontally in a standard mini-agarose gel apparatus, with typical run times of 6 to 8 minutes. Stein et al. describe a novel sample loading system which permits volumes as low as 0.1 μl to be analyzed. The band resolution compares favorably with that of large-format sequencing gels. Within the context of the sequencing of linear arrays according to the invention, the sample loading is accomplished by performing the termination reactions within, or at the very edge of the gel, rather than by mechanical means.

[0306] Since the terminated reaction products remain bound to the template, the reaction may be cleaned of dNTPs, primers and salts by diffusion, flow and/or electrophoresis. The termination products are then denatured and electrophoresed perpendicular to the line of amplified features in a thin slab or capillary format. An important aspect of this method is that the order of the amplified features is preserved throughout the process. Thus, if the line of features comes from a chromosome or large cloned or uncloned DNA fragment, the long range order is preserved and greatly aids in the assembly of complex genomic regions even in the presence of long repeats. Similarly, if the lines of features are derived as replicas of lines from the standard 2D arrays, the sequence identity of each spot in that line may be determined. Similar replicas of additional lines from the 2D spot may be used to determine the identity of each spot or feature of the 2D array. In addition to the clear advantages regarding the spatial organization of the features, this method has the additional advantage of actually using more of the sequencing reaction than other methods. That is, all of the reaction products are electrophoresed, rather than just a portion of them, meaning there is less waste of reagents. Further, the immobilization of the features allows the use of a common pool of reagents to sequence many features simultaneously. Thus, the method is more economical on a per sequence basis.

EXAMPLE 14

[0307] Multiplex PCR

[0308] Multiplex PCR refers to the process of amplifying a number of different DNA molecules in the same PCR reaction. Generally, the process involves the addition of multiple primer pairs, each pair specific for the amplification of a single DNA target species. A major goal of investigators is to apply the power of multiplex PCR to the problem of high throughput genotyping of individuals for specific genetic markers. If 100,000 polymorphic markers are to be assayed per genome, it would be very expensive to perform 100,000 individual PCR reactions. Some advances have been made in multiplexing PCR reactions (Chamberlain et al., 1988, Nucl. Acids Res. 16: 11141), and the degree of multiplexing of the PCR has been scaled up, followed by hybridization to an array of allele-specific probes (Wang et al., 1998, Science 280: 1077). However, in the studies by Wang et al., the percentage of PCR products that successfully amplified decreased as the number of PCR primers added to the reaction increased. When approximately 100 primer pairs were used, about 90% of the PCR products were successfully amplified. When the number of primer pairs was increased to about 500, about 50% of the PCR products were successfully amplified.

[0309] The decreasing efficiency with increasing number of primers is due in large part to the phenomenon of “primer dimer” formation. Primer dimers are the result of fortuitous 3′ terminal complementarity of 4 bp or more between primers. This complementarity allows hybridization which is stabilized by polymerase recognition and extension of both strands. After the first cycle of extension, the complementarity is no longer limited to the 3′ terminal nucleotides; rather, the entire primer dimer is now complementary to the primers. This reaction efficiently competes with the desired amplification reaction, in part because the concentration of the primers is significantly greater than that of the desired amplification target, kinetically favoring the amplification of the primer dimers. This phenomenon increases with increasing numbers and concentrations of primers.
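The 3′ terminal complementarity criterion can be made concrete with a small screening routine. The Python sketch below is offered only as an illustration under simple assumptions: it looks for perfect Watson-Crick pairing between the last k bases of two primers (k of 4 or more), ignores mismatches, internal pairing and thermodynamics, and uses hypothetical function names and made-up primer sequences.

```python
from itertools import combinations_with_replacement

_COMP = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMP)[::-1]


def three_prime_overlap(p1, p2, min_len=4):
    """Longest perfect antiparallel pairing between the 3' ends of p1 and p2.

    Returns the overlap length in bases, or 0 if it is shorter than min_len.
    """
    p1, p2 = p1.upper(), p2.upper()
    best = 0
    for k in range(min_len, min(len(p1), len(p2)) + 1):
        if p1[-k:] == revcomp(p2[-k:]):
            best = k
    return best


def dimer_prone_pairs(primers, min_len=4):
    """Index pairs (i, j), including i == j, whose 3' ends can pair."""
    return [
        (i, j)
        for i, j in combinations_with_replacement(range(len(primers)), 2)
        if three_prime_overlap(primers[i], primers[j], min_len) > 0
    ]


# Made-up primers for illustration only (5' to 3').
primer_a = "ATATATGGCA"
primer_b = "CACACATGCC"   # 3' end TGCC pairs with GGCA of primer_a
primer_c = "ATATATGGAA"
flagged = dimer_prone_pairs([primer_a, primer_b, primer_c])
```

Because every primer must be checked against every other primer and against itself, the number of comparisons grows roughly with the square of the number of primers, which is consistent with the observation that the problem worsens as more primers are added.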

[0310] A new approach to solving these inherent problems with multiplex PCR uses microarrays of immobilized, amplified PCR primers. By immobilizing at least one of the PCR primers, the method reduces the possibilities for non-specific primer interactions. The local concentration of primers is high enough for amplification, yet the individual primers are restricted from interacting non-specifically with one another.

[0311] Another disadvantage of standard multiplex PCR is that individual primer pairs must be synthesized for each polymorphic target. Genotyping DNA with 100,000 polymorphism targets would require, in theory, 200,000 different PCR primers. Not only is the synthesis of such primers costly and time consuming, but not all primer designs succeed in producing a desired PCR product. Therefore considerable time and energy will be spent optimizing the primer designs.

[0312] According to the new multiplex PCR method, one of the primers has a 5′ end which is generic for the entire multiplex PCR reaction, such that the entire multiplex reaction will have that segment on the “mobile” primer. This 5′ generic sequence may contain a restriction site for later cloning, a bacteriophage or other promoter for transcription of the products, or some other useful or identifiable sequence. The 3′ end of the mobile primer is complementary to any genomic (or cDNA) sequence which is to be amplified at a reasonable PCR distance from the 3′ end of the immobile primer. In other words, the 3′ end of the mobile primer is randomized. The length of the randomized 3′ sequence may be as few as 5 nucleotides, up to 10 nucleotides or more. The second, or “specific” primers are immobilized (according to methods known in the art or described herein) to keep them from diffusing into the other primer pair zones while the mobile primer allows the extended product to diffuse.

[0313] There are at least two ways primer pairs may be distributed. First, two presynthesized Acrydite primers may be codeposited (Kenney et al., 1998, Biotechniques 25: 516-521; Rehman et al., 1999, Nucl. Acids Res. 27: 649-655), along with template and polymerase, in a gel volume element, for example by aerosol, emulsion, or inkjet printer, from an equimolar primer mixture. Alternatively, the primers may be derived from genomic DNA by a localized PCR. Generic primers can be used with one immobilized primer to make amplified features, and then release the new extended primers by exonuclease or type II restriction enzymes as described elsewhere herein. The new extended primers would then be copolymerized, along with template and polymerase, into the gel.

[0314] The process of this modified multiplex PCR method can be thought of as essentially two different steps. In the first, primers immobilized in a microarray hybridize with their complementary sequence in the template and are extended. In the second, and subsequent steps, the 3′ (randomized) end of the mobile primers hybridizes at some point along the length of the extended immobilized primer and is itself extended. In subsequent cycles, other molecules in the immobilized primer features hybridize with the products of the previous extension, allowing extension, and so on, yielding exponential amplification as in standard PCR.

[0315] The multiplex PCR strategy need not involve replica printing.

EXAMPLE 15

[0316] Amplification of Nucleic Acid Molecules in a Polymer Gel

[0317] According to one aspect of the present invention, an array of nucleic acid molecules is produced as a result of amplification of an initial nucleic acid molecule, whether alone or as part of a plasmid, in a polymer gel or other suitable gel matrix which is placed on a solid support. The gel matrix advantageously serves to immobilize the amplified nucleic acid molecules whether by covalent interaction or steric hindrance between the nucleic acid molecules and the gel matrix. Suitable gel matrices within the scope of the present invention include those prepared by polymerization of one or more commercially available monomers such as acrylamide and the like to form a polyacrylamide gel matrix. One of ordinary skill in the art will readily recognize that other suitable polymer-based matrices are useful in the practice of the present invention. The present invention also includes other gel matrices such as those made from starches, agarose and the like. As an illustration of one aspect of the present invention, polyacrylamide gel matrices will be discussed.

[0318] The solid support can be fashioned of any material known to those of skill in the art to be suitable in the practice of the present invention. The surface of the solid support can optionally be pretreated in a manner to increase adherence of the polyacrylamide gel to the solid support. According to a preferred embodiment, the solid support is fashioned out of glass. A convenient solid support for use with the present invention is a glass microscope slide.

[0319] According to a general embodiment of the present invention, acrylamide monomers are polymerized in a liquid mixture containing at least one standard commercially available or readily manufactured oligonucleotide primer reagent, such as a PCR primer, and an effective amount of template nucleic acid. One of ordinary skill in the art will recognize that the principles of the present invention apply to single stranded nucleic acids, double stranded nucleic acids, or triple stranded nucleic acids. For purposes of illustration of the present invention, template DNA and PCR reagents will be discussed. According to one embodiment, the PCR primers are present in pairs (at least two) and in amounts sufficient to amplify the DNA template when subject to certain reaction conditions. The resulting gel matrix is poured onto a solid support which is subjected to conditions sufficient to effect amplification of the DNA template. As the amplification reaction proceeds, the products remain localized near their respective templates due in part to the polyacrylamide gel. The amplification reaction results in an amplified sequence feature consisting of 10⁸ or more essentially identical molecules.

[0320] According to one aspect of the present invention, one or more of the PCR primers includes a linker moiety which covalently reacts with the chosen monomer during polymerization of the gel matrix. As a result, the PCR primers become covalently bound to and immobilized within the polymer gel matrix. One such linker moiety for use with polyacrylamide gel matrices is a commercially available linker moiety known as ACRYDITE. ACRYDITE is a phosphoramidite that contains an ethylene group which enters into a free-radical copolymerization with acrylamide. A PCR primer can be modified to include the ACRYDITE moiety at the 5′ end (Kenney et al., 1998, BioTechniques 25: 516-521). As a result, the amplified DNA in each feature can be covalently attached by one of its ends to the polyacrylamide gel matrix. One of ordinary skill in the art will become aware of other linker moieties useful in the present invention to covalently bind to the gel matrix of choice based upon the disclosure presented herein.

Primers

[0321] Primers useful in the practice of the present invention were obtained from Operon (CA) and are identified below. Certain primers used for creation of cassettes had common sequences which are indicated below by bold type, italicized type, underscored type, or bold-italicized type.

[0322] Primers Used for Solid Phase Amplification:
Primer OutF: 5′-cca cta cgc ctc cgc ttt cct ctc-3′ (SEQ ID NO: 10)
Primer OutR: 5′-ctg ccc cgg gtt cct cat tct ct-3′ (SEQ ID NO: 11)
Primer AcrOutF: 5′-Qcca cta cgc ctc cgc ttt cct ctc-3′ (SEQ ID NO: 12)
Primer InF: 5′-ggg cgg aag ctt gaa gga ggt att-3′ (SEQ ID NO: 13)
Primer InR: 5′-gcc cgg tct cga gcg tct gtt ta-3′ (SEQ ID NO: 14)
Primer AcrInF: 5′-Qggg cgg aag ctt gaa gga ggt att-3′ (SEQ ID NO: 15)
Primer PucF: 5′-ggg cgg aag ctt gaa gga ggt att taa gga gaa aat acc gca tca gg-3′ (SEQ ID NO: 16)
Primer PucR1: 5′-gcc cgg tct cga gcg tct gtt tac acc gat cgc cct tcc caa ca-3′ (SEQ ID NO: 17)
Primer PucR2: 5′-gcc cgg tct cga gcg tct gtt taa att cac tgg ccg tcg ttt tac aa-3′ (SEQ ID NO: 18)
Primer PucR3: 5′-gcc cgg tct cga gcg tct gtt tac caa tac gca aac cgc ctc tcc-3′ (SEQ ID NO: 19)
Primer PucNestF: 5′-cca cta cgc ctc cgc ttt cct ctc ggg cgg aag ctt gaa gga ggt att-3′ (SEQ ID NO: 20)
Primer PucNestR: 5′-ctg ccc cgg gtt cct cat tct ctg ccc ggt ctc gag cgt ctg ttt a-3′ (SEQ ID NO: 21)

[0323] The primers AcrOutF and AcrInF include an ACRYDITE modification which is commercially available from Mosaic Technologies, Inc. (Waltham, Mass., USA). The primers are modified at their 5′ ends with the ACRYDITE moiety which is designated by the character Q in the sequences listed above. Since ACRYDITE is a phosphoramidite that contains an ethylene group capable of free-radical copolymerization with acrylamide, primers including the ACRYDITE moiety will polymerize directly into and become covalently bound to the acrylamide gel as it solidifies (Kenney et al., 1998, supra).

Design of Amplification Cassettes

[0324] Amplification cassettes useful in the practice of the present invention were prepared. The plasmid pUC19 was amplified in a PCR reaction according to the following method. 50 μl of a PCR mixture containing 10 mM Tris-HCl pH 8.3, 50 mM KCl, 0.01% gelatin, 1.5 mM MgCl₂, 200 μM dNTPs, 0.5 μM primer PucF, 0.5 μM primer PucR2, 2 ng pUC19 plasmid, and 2 units Taq (Sigma) was cycled in an MJ Research PTC-100 thermocycler. The cycle used was denaturation (1 min at 94° C.), 5 cycles (10 sec at 94° C., 10 sec at 55° C., 1 min at 72° C.), 20 cycles (10 sec at 94° C., 1 min at 68° C.), and extension (3 min at 72° C.). The PCR product was purified using Qiaquick PCR purification columns (Qiagen), and resuspended in deionized water.

[0325] Two additional amplification cassettes were created, a 120 bp cassette (CP-120) and a 514 bp cassette (CP-514), and used to determine the relationship between the length of the amplification cassette and the resulting amplified feature diameter. These two cassettes were created as described above, except the reverse primers PucR1 and PucR3 were used instead of PucR2 in the first PCR mixture.

[0326] A further 281 bp cassette (CP-281) was also created, and used in replica amplification experiments. CP-281 is identical to CP-234 except that it is flanked by two additional primer sites. These primer sites allowed a nested solid phase PCR reaction to create duplicate amplified feature slides without contamination from primer-dimer molecules. CP-281 was created by cycling a PCR mixture of 10 ng CP-234, 10 mM Tris-HCl pH 8.3, 50 mM KCl, 0.01% gelatin, 1.5 mM MgCl₂, 200 μM dNTPs, 0.5 μM primer PucNestF, 0.5 μM primer PucNestR, and 2 units Taq (Sigma) as follows: denaturation (1 min at 94° C.), 5 cycles (10 sec at 94° C., 10 sec at 55° C., 1 min at 72° C.), 22 cycles (10 sec at 94° C., 1 min at 68° C.), and extension (3 min at 72° C.). The PCR product was purified using Qiaquick PCR purification columns (Qiagen), and resuspended in deionized water.

Creating Slides of Nucleic Acid Molecules Immobilized in a Gel Matrix

[0327] One aspect of the present invention includes a method of making an array of nucleic acid molecules that are immobilized in a gel matrix. According to the present invention, a liquid mixture of template DNA, a pair of PCR primers, at least one of which primers is optionally 5′ ACRYDITE modified, and acrylamide monomers is prepared. The liquid mixture is poured onto a solid substrate such as a glass slide. The liquid mixture is then polymerized under suitable conditions. The template DNA is also amplified by PCR under suitable conditions. The result is an array having amplified nucleic acid molecules that are immobilized. The method is described in greater detail in the following non-limiting example.

[0328] To create an array slide according to this aspect of the invention, template DNA was amplified by PCR in a polyacrylamide gel poured onto a glass microscope slide. Dilute amounts of template CP-234 (0-360 molecules, quantified by ethidium bromide staining and gel electrophoresis) were added to the solid phase PCR mixture containing 10 mM Tris-HCl (pH 8.3), 50 mM KCl, 0.01% gelatin, 1.5 mM MgCl₂, 200 μM dNTP's, 0.5 μM primers, 2 ng pUC19 plasmid, 10 units JumpStart Taq (Sigma), 6% Acrylamide, 0.32% Bis-Acrylamide, 1 μM primer AcrInF, and 1 μM primer InR. Two 65 μl frame-seal chambers (MJ Research) were attached to a glass microscope slide that had been pre-treated with bind-silane (Pharmacia). Other types of bind-silane are commercially available from Sigma. Pre-treatment of a glass slide with bind-silane results in the enhanced binding of the polymerized polyacrylamide to the slide.

[0329] 2.5 μl of 5% ammonium persulfate, and 2.5 μl of 5% TEMED were added to 150 μl of the solid phase PCR mixture. 65 μl of this solution was added to each chamber. The chambers were then immediately covered with No. 2 coverslips (Fisher, 18 mm×18 mm), and the gel matrix was allowed to polymerize for 10-15 minutes. Thermostable, template-dependent DNA polymerases other than JumpStart Taq polymerase are known to those skilled in the art and are also useful in this, and other aspects of the invention.
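The dilution described in this paragraph can be checked with a short calculation. The following is only an illustrative sketch: the function name final_percent is hypothetical, and the only inputs are the volumes and stock concentrations stated above.

```python
def final_percent(stock_percent, stock_volume_ul, total_volume_ul):
    """Final concentration (%) after diluting a stock into a total volume."""
    return stock_percent * stock_volume_ul / total_volume_ul

# 150 ul of PCR mixture plus 2.5 ul each of 5% ammonium persulfate and 5% TEMED
total_volume_ul = 150.0 + 2.5 + 2.5
aps_final = final_percent(5.0, 2.5, total_volume_ul)
temed_final = final_percent(5.0, 2.5, total_volume_ul)

# Two 65 ul chambers are filled from the 155 ul of catalyzed mixture
volume_needed_ul = 2 * 65.0

print(f"APS: {aps_final:.3f}%, TEMED: {temed_final:.3f}%")
print(f"Mixture available: {total_volume_ul} ul, needed: {volume_needed_ul} ul")
```

On these figures each initiator is present at roughly 0.08%, and the 155 μl prepared is sufficient for the two 65 μl chambers.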

[0330] The slide was then cycled using a PTC-200 thermal cycler (MJ Research) adapted for glass slides (16/16 twin tower block). The following program was used: denaturation (2 min at 94° C.), 40 cycles (30 sec at 93° C., 45 sec at 62° C., 45 sec at 72° C.), extension (2 min at 72° C.). The coverslips were removed and the gels were stained in SYBR green I (diluted 5000 fold in TE, pH 8.0), and imaged on a Storm phosphorimager (Molecular Dynamics) or a confocal microscope (Leica).
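For planning purposes, the cycling program above can be written down as data and its total hold time summed. This is a minimal sketch under stated assumptions: the data layout and the name total_hold_seconds are hypothetical, and ramping time between temperatures, which depends on the instrument, is not included.

```python
# Each stage: (label, number of repeats, [(temperature in deg C, hold in seconds), ...])
PROGRAM = [
    ("denaturation", 1, [(94, 120)]),
    ("cycling", 40, [(93, 30), (62, 45), (72, 45)]),
    ("extension", 1, [(72, 120)]),
]

def total_hold_seconds(program):
    """Sum of all hold times in the program, ignoring temperature ramps."""
    return sum(repeats * sum(seconds for _, seconds in steps)
               for _, repeats, steps in program)

hold = total_hold_seconds(PROGRAM)
print(f"Total hold time: {hold} s ({hold / 60:.0f} min)")
```

The hold times alone add up to 84 minutes, so an actual run will take somewhat longer once ramping is included.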

Determining Relationship Between Amplified Feature Diameter, Template Length and Acrylamide Concentration

[0331] The relationship between amplified feature diameter, template length and acrylamide concentration was determined as follows. Slides were poured in the manner described above. The ratio of bis-acrylamide to acrylamide was 1:19 for all slides poured. After the slides were cycled, the coverslips were removed and the gels were stained as above. The gels were imaged using the Storm phosphorimager. Any gels with amplified features less than 300 μm in diameter were imaged on the confocal microscope. Care was taken to image only the amplified features that could be completely resolved from other amplified features. These images were captured, and the intensity values saved as a text file. The data were smoothed using a 17 point averaging algorithm, and the full width at half maximum of each amplified feature was recorded as its diameter.
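The smoothing and diameter measurement described above can be sketched as follows. The text does not specify the averaging algorithm beyond its 17 point width, so this sketch assumes a centered, unweighted moving average, measures the half maximum relative to the lowest value in the profile, and uses linear interpolation between samples. The function names and the spacing parameter (distance between samples, in whatever unit the image uses) are hypothetical.

```python
def moving_average(values, window=17):
    """Centered, unweighted moving average; near the edges only the
    samples that exist are averaged (an assumption of this sketch)."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed

def fwhm(values, spacing=1.0):
    """Full width at half maximum of a single-peaked intensity profile."""
    peak, base = max(values), min(values)
    half_level = base + (peak - base) / 2.0
    i_peak = values.index(peak)

    # Walk left from the peak to the last sample still at or above half level
    i = i_peak
    while i > 0 and values[i - 1] >= half_level:
        i -= 1
    if i == 0:
        left = 0.0
    else:
        left = (i - 1) + (half_level - values[i - 1]) / (values[i] - values[i - 1])

    # Walk right from the peak in the same way
    j = i_peak
    while j < len(values) - 1 and values[j + 1] >= half_level:
        j += 1
    if j == len(values) - 1:
        right = float(j)
    else:
        right = j + (values[j] - half_level) / (values[j] - values[j + 1])

    return (right - left) * spacing
```

A diameter would then be obtained as fwhm(moving_average(profile), spacing=pixel_size), where profile is one line of intensity values through the center of a resolved feature. Note that averaging over 17 points slightly broadens narrow features.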

[0332] Features of a DNA array were amplified on a glass microscope slide by performing solid phase PCR (see Lockley et al., 1997, Nucl. Acids Res. 25: 1313-1314) in an acrylamide gel. The general design of the template DNA cassettes used to create the amplified feature array slide is shown in FIG. 7. The template DNA includes binding sites for the pair of PCR primers, one on either side of a sequence of interest. For most applications, the sequence of interest will be a variable region, with the variable region of each cassette molecule containing a different DNA fragment. This complex template library will contain sequences derived from the genome or cDNA of the organism of interest flanked by constant regions that allow PCR amplification (Singer et al., 1997, Nucl. Acids Res. 25: 781-786). However, to demonstrate and optimize the in vitro cloning of DNA, only one species of DNA was used in the solid phase PCR: the cassette CP-234, a 234 base pair template derived from the plasmid pUC19. Very dilute amounts of the template DNA CP-234 were included in a PCR mix that contained 6% acrylamide and 0.3% bis-acrylamide. This mix was then used to pour a thin (250 μm) acrylamide gel on top of a glass microscope slide. One of the primers included in the mix contained an ACRYDITE group at its 5′ end, so that it was immobilized in the acrylamide matrix when the gel polymerized. Solid phase PCR (so named because one of the primers is immobilized to a solid support) was performed by thermal cycling of the slide. The gels went through 40 cycles of denaturation, annealing and extension, and were stained using SYBR Green I.

[0333] Upon imaging, green fluorescent spheres were seen in the gels that had been poured with template DNA (FIG. 8A). These spheres were not seen in the control slide lacking template DNA. The spheres were uniform in shape and roughly 300 μm in diameter, with little variation in size. The number of fluorescent spheres shows a linear dependence on the number of template molecules added (FIG. 8B).

[0334] In order to confirm that the fluorescent spheres were DNA features which were amplified from a single molecule of the template cassette CP-234, stained spheres were removed using a toothpick and placed into a tube containing a PCR mixture, and the mix was thermal cycled. As a negative control, regions of the gel that did not contain fluorescent spheres were also removed using a toothpick, mixed with a PCR mixture and thermal cycled. The reactions were then run out on an agarose gel. The results are shown in FIG. 8C. The sample containing the stained spheres clearly showed products at 234 bp as expected, while the sample containing regions of the gel that showed no spheres yielded no product.

[0335] While not wishing to be bound by any scientific theory, it is believed that the stained spheres shown in FIG. 8A are due to the amplification of single template molecules. First, the number of amplified features obtained in each reaction is linearly dependent on the amount of template included. As seen in FIG. 8B, eighty percent of the template molecules added to each reaction yielded amplified features. Less than one hundred percent efficiency is believed to be due to possible damage to template molecules by the free radicals generated during the acrylamide polymerization, loss of template molecules to abstraction by tube or pipette tip walls, or underestimation of the amount of template when quantified by ethidium bromide staining. Second, amplified feature-picking experiments confirmed that product of the expected length can be produced. Third, as shown in FIG. 4, amplified feature size is strongly dependent on the length of the template.

[0336] In some experiments, a few larger fluorescent spheres (1-2 mm in diameter) were observed. Because these spheres were also observed on slides that were poured without template DNA, it was suspected that these spheres were the result of primer-primer mispriming (primer dimer). This was confirmed by repeating the sphere-picking experiment described above on the putative primer-dimer spheres (data not shown). Primer dimer spheres or features can be reduced or eliminated by raising the annealing temperature of the PCR and/or by careful primer design as known by those skilled in the art.

[0337] Because the number of amplified features per slide goes up with the inverse square of the feature size, it is necessary to minimize the size of each amplified feature in order to obtain slides with as many amplified features as possible. In order to determine the parameters that influence amplified feature size, solid phase PCR reactions were performed using template cassettes of different lengths. Acrylamide concentration was also varied. The results are shown in FIG. 9.

[0338] The results, shown in FIG. 9A, show that amplified feature radius decreases as template length increases and as the acrylamide percentage increases. Using the 514 base pair template, CP-514, and an acrylamide concentration of 15%, the amplified features produced were very small (average radius of 12.5 μm), and of uniform size (standard deviation of 0.29 μm).

[0339] These results showed that amplified feature radius was very sensitive to length of the template. In order to further minimize amplified feature size, a template cassette was created that was 1009 base pairs long. When this cassette was used as template in a solid phase PCR in 15% acrylamide, the resulting amplified features had radii of approximately 6 μm (FIG. 9B). At this size, it is estimated that 5 million distinguishable amplified features can be obtained on a single slide: over 13.5 million would actually be poured on the slide, but 63% of these would overlap one another. It is believed that amplified feature radius could be further reduced by increasing the length of the template DNA, by using fewer cycles of PCR, or by immobilizing both primers.
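The feature-count estimate in this paragraph, and the inverse-square relationship between feature size and features per slide, can be reproduced with a few lines of arithmetic. This is an illustrative sketch only: the 13.5 million and 63% figures are taken from the text as given and are not derived here, and the variable names are hypothetical.

```python
# Figures stated in the text
features_poured = 13.5e6      # features actually poured on the slide
overlap_fraction = 0.63       # fraction expected to overlap a neighbor

# Features that remain distinguishable
distinguishable = features_poured * (1.0 - overlap_fraction)

# Features per slide scale with the inverse square of feature size, so
# going from a 12.5 um radius (CP-514) to a 6 um radius (1009 bp cassette)
# raises the attainable density by the square of the radius ratio.
density_gain = (12.5 / 6.0) ** 2

print(f"Distinguishable features: {distinguishable / 1e6:.2f} million")
print(f"Density gain from 12.5 um to 6 um radius: {density_gain:.2f}x")
```

The result, just under 5 million, matches the estimate in the text, and the reduction in radius corresponds to a little more than a four-fold gain in attainable feature density.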

[0340] A simulation of amplified feature growth was developed to investigate the apparent relationship shown in FIG. 4A between feature size and variation in size. This model assumes that at each cycle in the PCR reaction, every DNA molecule will move in a stochastic fashion (due to thermal energy) and then give rise to a complementary strand. The probability that a given molecule will give rise to a complementary strand is dependent on the number of unextended primers and the number of complementary strands in the immediate vicinity of the DNA. This model was tested using a number of different probability distribution functions for DNA motion, with all runs assuming that the DNA does not travel too far in relation to the average distance between immobilized primers. In all cases the results were qualitatively similar. This model predicts that template amplification in each feature is exponential during the early amplification cycles. As the amplified feature grows, it will reach a certain radius, the critical radius, after which the amplification proceeds at a polynomial rate. The critical radius is dependent on the diffusion coefficient of the template molecule, and the probability that a given DNA molecule is replicated after one cycle of the solid phase PCR. While not wishing to be bound by any one theory, one possible explanation is that one of the primers in the reaction is immobilized. Therefore, for an amplified feature to achieve exponential amplification, one strand of each full length DNA product in the feature must diffuse and anneal to an immobilized primer at each round of amplification. In this theory, during the early rounds, most of the immobilized primers in the vicinity of a template have not yet been extended, so the total number of DNA molecules in a feature increases exponentially with the cycle number. However, at later rounds, the DNA at the center of the feature cannot diffuse far enough to find immobilized primer that has not yet been extended. So, only the DNA near the circumference of the feature can continue to amplify. Therefore, the number of new DNA molecules generated with each cycle increases as the square of the cycle number, so that the total number of DNA molecules in the feature increases with the cube of the cycle number.
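The last step of this reasoning, that a per-cycle yield growing as the square of the cycle number gives a total growing as the cube, can be checked numerically. The sketch below is not the simulation referred to in the text; it only sums a quadratic per-cycle yield, with the proportionality constant set to one, and the function name is hypothetical.

```python
def total_after(cycles):
    """Total molecules when cycle k contributes k**2 new molecules
    (proportionality constant taken as 1 for illustration)."""
    return sum(k ** 2 for k in range(1, cycles + 1))

for n in (10, 100, 1000):
    total = total_after(n)
    # The ratio total / n**3 settles toward 1/3, i.e. cubic growth
    print(n, total, round(total / n ** 3, 4))
```

The closed form of the sum is n(n+1)(2n+1)/6, whose leading term is n³/3, which is why the printed ratio approaches one third as the number of cycles grows.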

[0341] Accordingly, it is possible, for example, that when the long DNA template, CP-514, was amplified to form amplified features, the features reached their critical radii and then grew very slowly for the rest of the reaction. Therefore, all of the amplified features tended to be the same size. In contrast, it is also possible that when the short DNA template, CP-120, was used, the features never reached their critical radii, so that some amplified features were bigger or smaller than others due to the stochastic nature of PCR.

EXAMPLE 16

[0342] Duplicating Array Slides

[0343] One aspect of the invention encompasses a method of making a plurality of arrays from a single array having nucleic acid molecules immobilized in a polyacrylamide gel. According to the method of the present invention, a liquid mixture of template DNA, a pair of PCR primers, at least one of which primers is 5′ ACRYDITE modified, and acrylamide monomers is poured onto a solid substrate, such as a glass microscope slide, and then polymerized under suitable conditions to form a first layer. A liquid mixture of a pair of PCR primers, at least one of which primers is optionally 5′ ACRYDITE modified, and acrylamide monomers without template DNA is poured on top of the first layer, and then polymerized to form a second layer. The template DNA is then amplified under suitable conditions to generate a nucleic acid array which is immobilized in the polyacrylamide gel matrix. Because the second layer is held in contact with the first layer during the amplification, a portion of the amplified nucleic acids from the first layer are transferred to the second layer whether by diffusion, adhesion, covalent bonding or other mechanism. The second layer is then removed and the process repeated as many times as desired to generate a plurality of arrays. The method is described in greater detail in the following non-limiting example.

[0344] To duplicate arrays of the present invention containing immobilized nucleic acids, a sandwich of two layers of acrylamide, the “transfer layer” and the “readout layer” is prepared. To create the transfer layer, template DNA is added to a solid phase PCR mix (10 mM Tris-HCl (pH 8.3), 50 mM KCl, 0.01% gelatin, 1.5 mM MgCl₂, 200 μM dNTP's, 0.5 μM primers, 2 ng pUC19 plasmid, 10 units JumpStart Taq (Sigma), 6% Acrylamide, 0.32% Bis-Acrylamide, 1 μM primer AcrOutF, 1 μM primer OutR). Ten microliters of this solution are then pipetted onto a clean coverslip (18 mm×18 mm), and the coverslip is picked up by a bind-silane treated slide. The slide is placed in an argon atmosphere to promote polymerization of the acrylamide. The coverslip is then removed, leaving a gel that is approximately 32 μm thick. To pour the readout layer, a fresh solid phase PCR mix is made; however, no template is added to this mixture. A frame seal chamber is then placed over the transfer layer, and, using a bind-silane treated glass coverslip, the readout layer (250 μm) is poured over the 32 μm transfer layer. The slide is then thermal cycled as described above.
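The thickness of the transfer layer follows from the volume pipetted and the area of the coverslip. The sketch below assumes that the ten microliters spread evenly under the entire 18 mm×18 mm coverslip with no loss at the edges; the function name is hypothetical.

```python
def gel_thickness_um(volume_ul, side_mm):
    """Thickness of a gel of the given volume spread under a square coverslip.
    Uses 1 ul = 1 cubic millimeter."""
    area_mm2 = side_mm * side_mm
    thickness_mm = volume_ul / area_mm2
    return thickness_mm * 1000.0

transfer_layer_um = gel_thickness_um(10.0, 18.0)
print(f"Transfer layer: about {transfer_layer_um:.1f} um thick")
```

Under that assumption the layer is about 31 μm thick, in line with the approximate figure of 32 μm given above.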

[0345] When the coverslip is carefully removed from the top of the frame seal chamber, the readout layer will stick to the coverslip, while the transfer layer will be left on the slide. The readout layer can then be stained with SYBR Green I and imaged. The transfer layer is then used to make duplicates. To do so, the slide is washed 2× in 10 mM Tris-HCl, 2× in 500 mM KCl, 2× in 10 mM Tris, 100 mM KCl, and 2× in dH₂O. The duplicate gel is then made by placing a frame seal chamber (15 mm×15 mm) over the transfer layer, and pipetting 65 μl of the duplicate solid-phase PCR mix (10 mM Tris-HCl pH 8.3, 50 mM KCl, 0.01% gelatin, 1.5 mM MgCl₂, 200 μM dNTP's, 0.5 μM primer AcrInF, 0.5 μM primer InR, 10 units JumpStart Taq (Sigma), 6% Acrylamide, 0.32% Bis-Acrylamide), onto the transfer layer. The duplicate slide is then cycled as follows: denaturation (2 min at 94° C.), 25 cycles (30 sec at 93° C., 45 sec at 62° C., 45 sec at 72° C.), extension (2 min at 72° C.). Because the coverslip used to pour the duplicate gel was not treated with bind-silane, the gel stuck to the transfer layer when the coverslip was removed; therefore when the duplicate was stained and imaged, the amplified feature pattern of the array was rotated 180 degrees from that of the readout layer.

[0346] According to the above protocol, a DNA array slide was created by pouring a thin, 3.1 μm gel containing template DNA (the template or transfer layer) on a bind silane-treated glass microscope slide, and then pouring a thicker gel (250 μm) over it, the thicker gel lacking template DNA but containing primers. When the sandwich is thermal cycled, the DNA in the thin layer produces amplified DNA features that span the interface between the two gels.

[0347] When the coverslip was carefully removed from the microscope slide, the thick gel remained intact and attached to the coverslip. This gel was stained with SYBR Green I and saved for comparison with the duplicate. Because the surface of the slide was treated with bind silane before the original was poured, the 3.1 μm layer of acrylamide (the template layer) remained bound to the surface of the slide. The slide was washed, and a new gel, the “duplicate,” was poured on this glass slide. The duplicate was then thermal cycled and stained.

[0348] FIG. 10 shows the imaged original slide (A) and duplicate amplified feature slide (B). The duplicate slide exhibited an amplified DNA feature pattern that is identical to that of the original. The amplified DNA features on the duplicate tend to be slightly larger than those on the original due to diffusion in the duplicate solid phase PCR reaction.

EXAMPLE 17

[0349] Fluorescent In Situ Sequencing Extension Quantification with Cleavable Linkers

[0350] The method of sequencing nucleic acid molecules within a polyacrylamide gel matrix using the Fluorescent In Situ Sequencing Extension Quantification method and nucleotides labeled with cleavable linkers was demonstrated in the following experiments.

[0351] In order to evaluate the method, molecules of a known DNA sequence were first cast into a polyacrylamide gel matrix. The oligonucleotide sequencing primer RMGP 1-R (5′-gcc cgg tct cga gcg tct gtt ta) was annealed to the oligonucleotide puc514c (Q-5′ tcggccaacg cgcggggaga ggcggtttgc gtatca g taaacagac gctcgagacc gggc (sample 1)) or to the oligonucleotide puc234t (Q-5′ cccagtcacg acgttgtaaa acgacggcca gtgtcg a taaacagac gctcgagacc gggc (sample 2)). The bolded sequences (taaacagac gctcgagacc gggc, at the 3′ end of each template) denote the sequences to which the sequencing primer anneals, and Q indicates an ACRYDITE modification.

[0352] Equal amounts of template and primer were annealed at a final concentration of 5 μM in 1× EcoPol buffer (10 mM Tris pH 7.5, 5 mM MgCl₂), by heating to 95 degrees C. for 1 minute, slowly cooling to 50 degrees C. at a rate of 0.1 degrees per second, and holding the reaction at 50 degrees C. for 5 minutes. The primer:template complex was then diluted by adding 30 μl 1× EcoPol buffer and 2 μl 500 mM EDTA.

[0353] One microliter of each annealed oligonucleotide was added to 17 μl of acrylamide gel mixture (40 mM Tris pH 7.3, 25% glycerol, 1 mM DTT, 6% acrylamide (5% cross-linking), 17.4 units SEQUENASE version 2.0 (United States Biochemical, USB), 15 μg/ml E. coli single stranded binding protein (USB), 0.1 mg/ml BSA). Then, 1 μl of 1.66% TEMED and 1 μl of 1.66% APS were added and 0.21 μl of each mixture was pipetted onto bind-silane treated glass microscope slides. The slides were immediately put under an argon bed for 30 minutes to allow polymerization of the acrylamide.

[0354] The slides containing the spots of polyacrylamide containing DNA molecules to be sequenced were then washed in 40 mM Tris pH 7.5, 0.01% Triton X-100 for 30 seconds, after which they were ready for the incorporation of labeled nucleotides. For this experiment, dCTP labeled with the fluorophore Cy5 with either a non-cleavable linkage (referred to herein as Cy5-dCTP) or with a disulfide-containing cleavable linkage (referred to herein as Cy5-SS-dCTP) was used. The acrylamide spots containing known DNA to be sequenced were incubated in 30 μl of Cy5-dCTP extension mix (10 mM Tris pH 7.5, 50 mM NaCl, 5 mM MgCl₂, 0.1 mg/ml BSA, 0.01% Triton X-100, 0.1 μM unlabeled dCTP, 0.2 μM Cy5-dCTP) or in Cy5-SS-dCTP extension mix (10 mM Tris pH 7.5, 50 mM NaCl, 5 mM MgCl₂, 0.1 mg/ml BSA, 0.01% Triton X-100, 0.1 μM unlabeled dCTP, 0.2 μM Cy5-SS-dCTP) for 4 minutes at room temperature. The slides were washed twice, for 5 minutes each, in FISSEQ wash buffer (10 mM Tris pH 7.5, 250 mM NaCl, 2 mM EDTA, 0.01% Triton X-100), spun briefly to dry and scanned on a ScanArray 4000 confocal scanner (GSI Lumonics). The settings were as follows: Focus=2060, Laser=80%, PMT=80%, resolution=30 microns.

[0355] Cleavage of the cleavable disulfide linkages was performed by incubation with the reducing agent dithiothreitol (DTT). The slides were incubated overnight in FISSEQ wash buffer supplemented with 5 mM DTT, washed twice for 5 minutes each in wash buffer, spun briefly to dry and scanned as before. FIG. 6 shows the results of this experiment. Sample 1 incorporated both the cleavable and the non-cleavable fluorescently labeled nucleotide (see “Before DTT Wash” panels), while sample 2 did not, as was expected since only sample 1 had a G as the next template nucleotide. DTT wash (bottom panels) removed the fluorescent signal from the samples extended with Cy5-SS-dCTP, but not from the samples extended with the non-cleavably linked fluorophore, demonstrating that the cleavable linkages could be cleaved, or chemically bleached, from the Cy5-SS-dCTP-extended samples with reducing agent, but not from the Cy5-dCTP-extended samples. One of skill in the art would fully expect similar cleavable linkages to nucleotides other than dCTP (for example, dATP, dGTP, TTP or even ribonucleotides or further modified nucleotides) to function in a similar manner.
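The expectation that only sample 1 incorporates labeled dCTP can be checked directly from the primer and template sequences given for this example. The sketch below locates the primer binding site on each template by reverse complementation and reports the first base the polymerase would add. The function and variable names are hypothetical; the sequences are those of the text, written 5′ to 3′ without spaces.

```python
PRIMER = "gcccggtctcgagcgtctgttta"
TEMPLATES = {
    "sample 1 (puc514c)": "tcggccaacgcgcggggagaggcggtttgcgtatcagtaaacagacgctcgagaccgggc",
    "sample 2 (puc234t)": "cccagtcacgacgttgtaaaacgacggccagtgtcgataaacagacgctcgagaccgggc",
}

_COMPLEMENT = str.maketrans("acgt", "tgca")

def reverse_complement(seq):
    """Reverse complement of a lowercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def first_incorporated_base(template, primer):
    """Base added first when the annealed primer is extended.

    The primer pairs with its reverse complement on the template; the
    polymerase then reads the template base immediately 5' of that site
    and adds its complement to the 3' end of the primer.
    """
    site = reverse_complement(primer)
    start = template.find(site)
    if start < 1:
        raise ValueError("primer site not found, or no template base 5' of it")
    return template[start - 1].translate(_COMPLEMENT)

for name, template in TEMPLATES.items():
    print(name, "-> first base added:", first_incorporated_base(template, PRIMER))
```

For sample 1 the template base next to the primer site is G, so C is added first; for sample 2 it is A, so the first base added would be T and no labeled dCTP is expected to be incorporated.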

EXAMPLE 18

[0356] Enhancing the Performance of Nucleic Acid Sequencing in Polyacrylamide-Immobilized Arrays.

[0357] Polyacrylamide-immobilized nucleic acid arrays and replicas thereof, made as described herein above or through other methodologies, are useful as platforms for simultaneously sequencing the large number of different DNA molecules comprising the array. In particular, the FISSEQ methods described herein above, in all variations, are useful approaches to sequencing DNAs in polyacrylamide-immobilized arrays. There are a number of parameters of the polyacrylamide gels and sequencing conditions that may be modified to enhance the performance of the FISSEQ method (also referred to as ISAS, or “In Situ Amplification and Sequencing”) when performed on polyacrylamide-immobilized arrays.

[0358] One parameter that can be modified is the pore size of the gel. Larger pore size allows the polymerase(s) used for thermal cycling, sequencing, or both, to diffuse more freely and access the primed template. In the sequencing reactions, increased pore size increases the efficiency of base addition so that rapid “dephasing” or loss of synchrony of the template strands is prevented. Depending on the crosslinker and total acrylamide concentration, standard acrylamide pore sizes are generally about 5 to about 20 nanometers. For example, in gels with 5% total acrylamide and 4% bis-acrylamide cross linker, the pore size is about 5 nm. There are several methods known for creating so-called “macroporous” polyacrylamide gels, with pores of about 100 nm to about 600 nm in diameter. As used herein, the term “macroporous polyacrylamide gel” refers to a polyacrylamide gel with pore size of about 25 to 600 nm in diameter, with a preferred range of about 100 to about 600 nm.

[0359] First, polyethylene glycol (PEG) may be added to the gel. See for example, Righetti et al., 1992, Electrophoresis 13: 587-595, incorporated herein by reference, which describes gel polymerization in the presence of “laterally aggregating agents” such as PEG to increase pore size. A preferred preparation uses 6% acrylamide, 1.5% cross-linker (e.g., bis-acrylamide), with 2.5% PEG (10 kDa polymer size). The total acrylamide may be varied over a range from about 3% to about 12%, and the cross-linker may vary from about 1% to about 30%. All percentages are weight per volume. In these formulations, the PEG may be varied from 0% to about 25%, with the polymer size of the PEG molecules varying from about 1 kDa to about 20 kDa. Generally, the longer the PEG chain length, the lower the percentage of PEG needed to increase the pore size. The inclusion of PEG in the polyacrylamide gel results in pores up to approximately 100 times the size of those achievable using acrylamide alone.

[0360] Alternatively, N,N′-diallyltartardiamide (DATD) may be used as the cross linking agent. See for example, Spath and Koblet, 1979, Anal. Biochem. 93: 275-285, incorporated herein by reference, which compares DATD-cross-linked gels to Bis-acrylamide cross-linked gels.

[0361] As another alternative, it is known that polymerization at low temperatures results in larger pore sizes in polyacrylamide gels. Standard practice for polyacrylamide gel polymerization is to perform the reaction at room temperature. However, polymerization at 4° C. produces a gel with larger pore sizes compared to a gel of the same composition polymerized at room temperature. Generally, lower or reduced temperatures for gel polymerization include a range from about 0° C. to about 15° C., with a temperature of about 2° C. to about 4° C. being preferred. Polymerization at 4° C. in a 5% total acrylamide, 4% bis-acrylamide gel, for example, results in a pore size of about 30 nm, compared to pores of about 5-20 nm when the same gel is polymerized at room temperature (i.e., about 21° C.).

[0362] As another alternative, increasing the percentage of cross-linker (e.g., bis-acrylamide) in the acrylamide monomer solution is also known to result in a gel with larger pore size relative to gels formed with lower percentages of cross-linker (see Righetti et al., 1981, J. Biochem. Biophys. Meth. 4: 347-363, which is incorporated herein by reference). As noted above, cross-linker may be varied from about 1% to about 30%, with higher percentages yielding greater pore sizes.

[0363] In addition to gel pore size, another parameter that can be manipulated to enhance the efficiency of sequencing reactions in polyacrylamide array gels is the amount of secondary structure of the template DNAs. For example, single-stranded binding protein (SSBP) may be added to the sequencing reaction in order to reduce the amount of secondary structure of the template molecules. Reduced secondary structure reduces pausing by the polymerase that can contribute to dephasing of the reactions on an array. Generally, E. coli SSBP (U.S. Biochemical) is added to the sequencing reactions at concentrations ranging from about 1 μM to about 5 μM.

[0364] Salt conditions are also important in the amount of template secondary structure and may be varied to enhance sequencing efficiency on polyacrylamide-immobilized arrays. Generally, intramolecular interactions contributing to secondary structure are reduced as salt concentration is decreased. It is acknowledged that different polymerases useful in the methods of the invention can have different sensitivities to and requirements for salt concentrations. One of skill in the art is readily able to determine the effect of decreasing salt concentration on a given polymerase with respect to sequencing fidelity and efficiency. Useful salt concentrations generally range from about 2 to about 10 mM MgCl₂ and about 0 to 100 mM NaCl. Exemplary salt conditions for sequencing include the following: for Klenow fragment of E. coli DNA polymerase, 10 mM MgCl₂, without any NaCl; for Sequenase, 50 mM NaCl and 5 mM MgCl₂; for Bst polymerase, 50 mM NaCl and 5 mM MgCl₂.

[0365] Preferred conditions for sequencing polyacrylamide-immobilized DNA array features include 50 mM NaCl, and 5 μM SSBP, at room temperature using 0.5 μM Sequenase.

[0366] The temperature of the reaction may also be varied to enhance the efficiency of DNA sequencing reactions within the gel, as this also affects the secondary structure of the template molecules. Generally, the secondary structure is reduced as the temperature of the reaction is increased. It is helpful, therefore, to use a thermostable polymerase such as Bst polymerase (New England Biolabs) or Thermosequenase (Amersham).

[0367] When using higher temperatures for sequencing reactions it is helpful or sometimes even necessary to increase the length of the sequencing primer or the G+C content of the primer/primer binding sequences in order to determine the maximum temperature (T_m) at which primer annealing is maintained while reducing intramolecular template secondary structure. One of skill in the art may calculate the T_m for a given oligonucleotide primer at a given salt concentration. As an example, however, for primers greater than 10 bases in a 50 mM salt solution (standard PCR conditions), T_m may be estimated using the formula T_m=59.9+41 [% G+C (decimal value)]−[675/primer length].
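The estimate given by this formula can be computed as follows. This is a minimal sketch of the stated formula only, with the G+C content taken as a fraction between 0 and 1 and the result in degrees C.; the function name is hypothetical, and the example primer is the 23-base sequencing primer of Example 17, used here purely for illustration.

```python
def estimate_tm(primer):
    """Estimate T_m from the formula 59.9 + 41*(G+C fraction) - 675/length.

    Stated in the text for primers longer than 10 bases in 50 mM salt.
    """
    seq = primer.lower().replace(" ", "")
    if len(seq) <= 10:
        raise ValueError("formula is stated for primers greater than 10 bases")
    gc_fraction = (seq.count("g") + seq.count("c")) / len(seq)
    return 59.9 + 41.0 * gc_fraction - 675.0 / len(seq)

example_primer = "gcc cgg tct cga gcg tct gtt ta"
print(f"Estimated T_m: {estimate_tm(example_primer):.1f} degrees C")
```

Because the length term 675/primer length shrinks as the primer grows, lengthening a primer or raising its G+C content both raise the estimate, which is the adjustment this paragraph recommends for higher reaction temperatures.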

EXAMPLE 18 Use

[0368] The invention is useful for generating sets, each comprising a plurality of copies of a randomly-patterned, immobilized (thus highly reusable) nucleic acid array, from a first array upon which the molecules of a nucleic acid pool are randomly positioned, quickly, inexpensively and from unique pools of nucleic acid molecules, such as biological samples. The sets of arrays, and members of such sets, produced according to the invention are useful in expression analysis (Schena, et al., 1996, Proc. Nat. Acad. Sci. U.S.A., 93: 10614-10619; Lockhart, et al., 1996, Nature Biotechnology, 14: 1675-1680) and genetic polymorphism detection (Chee et al., 1996, Science, 274(5287): 610-614). They are also of use in DNA/protein binding assays and more general protein array binding assays. The methods of the invention are also useful for determining the sequences of nucleic acids on arrays.

OTHER EMBODIMENTS

[0369] Other embodiments will be evident to those of skill in the art. It should be understood that the foregoing description is provided for clarity only and is merely exemplary. The spirit and scope of the present invention are not limited to the above examples, but are encompassed by the following claims.

Number of SEQ ID NOS: 24

SEQ ID NO: 1; length 17; DNA; Artificial Sequence; amplification primer
taatacgact cactata
SEQ ID NO: 2; length 10; DNA; Artificial Sequence; amplification primer
tgcatgctat
SEQ ID NO: 3; length 25; DNA; Artificial Sequence; amplification primer
atagcatgca atgcatttac gtagc
SEQ ID NO: 4; length 32; DNA; Artificial Sequence; amplification primer
gcagcagtac gactagcata tccgacnnnn nn
SEQ ID NO: 5; length 32; DNA; Artificial Sequence; amplification primer
cgatagcagt agcatgcagg tccgacnnnn nn
SEQ ID NO: 6; length 66; DNA; Artificial Sequence; amplification primer
tcggctcatc tgcatgctgc cagcagtcgg actacgtacc ccggtacgtg cgctacacgc agcttt
SEQ ID NO: 7; length 88; DNA; Artificial Sequence; amplification primer
gcagcagtac gactagcata tccgacctgc gtgtagcgca cgtaccgggg tacgtagtcc gactgctggc agcatgcaga tgagccga
SEQ ID NO: 8; length 94; DNA; Artificial Sequence; amplification primer
cgatagcagt agcatgcagg tccgaccagc agtcggacta cgtaccccgg tacgtgcgct acacgcaggt cggatatgct agtcgtactg ctgc
SEQ ID NO: 9; length 94; DNA; Artificial Sequence; amplification primer
gcagcagtac gactagcata tccgacctgc gtgtagcgca cgtaccgggg tacgtagtcc gactgctggt cggacctgca tgctactgct atcg
SEQ ID NO: 10; length 24; DNA; Artificial Sequence; amplification primer
ccactacgcc tccgctttcc tctc
SEQ ID NO: 11; length 23; DNA; Artificial Sequence; amplification primer
ctgccccggg ttcctcattc tct
SEQ ID NO: 12; length 24; DNA; Artificial Sequence; amplification primer
ccactacgcc tccgctttcc tctc
SEQ ID NO: 13; length 24; DNA; Artificial Sequence; amplification primer
gggcggaagc ttgaaggagg tatt
SEQ ID NO: 14; length 23; DNA; Artificial Sequence; amplification primer
gcccggtctc gagcgtctgt tta
SEQ ID NO: 15; length 24; DNA; Artificial Sequence; amplification primer
gggcggaagc ttgaaggagg tatt
SEQ ID NO: 16; length 47; DNA; Artificial Sequence; amplification primer
gggcggaagc ttgaaggagg tatttaagga gaaaataccg catcagg
SEQ ID NO: 17; length 44; DNA; Artificial Sequence; amplification primer
gcccggtctc gagcgtctgt ttacaccgat cgcccttccc aaca
SEQ ID NO: 18; length 47; DNA; Artificial Sequence; amplification primer
gcccggtctc gagcgtctgt ttaaattcac tggccgtcgt tttacaa
SEQ ID NO: 19; length 45; DNA; Artificial Sequence; amplification primer
gcccggtctc gagcgtctgt ttaccaatac gcaaaccgcc tctcc
SEQ ID NO: 20; length 48; DNA; Artificial Sequence; amplification primer
ccactacgcc tccgctttcc tctcgggcgg aagcttgaag gaggtatt
SEQ ID NO: 21; length 46; DNA; Artificial Sequence; amplification primer
ctgccccggg ttcctcattc tctgcccggt ctcgagcgtc tgttta
SEQ ID NO: 22; length 23; DNA; Artificial Sequence; amplification primer
gcccggtctc gagcgtctgt tta
SEQ ID NO: 23; length 60; DNA; Artificial Sequence; amplification primer
tcggccaacg cgcggggaga ggcggtttgc gtatcagtaa acagacgctc gagaccgggc
SEQ ID NO: 24; length 60; DNA; Artificial Sequence; amplification primer
cccagtcacg acgttgtaaa acgacggcca gtgtcgataa acagacgctc gagaccgggc

1. A method of making an immobilized nucleic acid molecule arraycomprising: a) providing an immobilized array of spots of a nucleic acidcapture activity wherein: i) said spots are separated by a distancegreater than the diameter of said spots; and ii) the size of said spotsis less than the diameter of the excluded volume of said nucleic acidmolecule to be captured; and b) contacting said array of spots of anucleic acid capture activity with an excess of nucleic acid moleculescapable of being bound by said nucleic acid capture activity, saidnucleic acid molecules having an excluded volume diameter greater thanthe diameter of said spots, resulting in an immobilized nucleic acidarray in which each said spot of said nucleic acid capture activity canbind only one of said nucleic acid molecules having an excluded volumegreater than the size of said spots.
2. The method of claim 1 wherein said nucleic acid capture activity is selected from the group consisting of: a hydrophobic compound; an oligonucleotide; an antibody or fragment of an antibody; a protein; a peptide; an intercalator; biotin; and avidin or streptavidin.
3. The method of claim 1 wherein said immobilized array of spots of a nucleic acid capture activity are arranged in a predetermined geometry.
4. The method of claim 1 wherein said spots of nucleic acid capture activity are aligned with other microfabricated features.
5. A method of making a plurality of a nucleic acid array wherein said nucleic acid array is produced according to the method of claim 1.
6. A method for the detection of a nucleic acid on an array of nucleic acid molecules, said method comprising:
a) generating a plurality of a nucleic acid molecule array wherein the nucleic acid molecules of each member of said plurality occupy positions which correspond to those positions occupied by the nucleic acid molecules of each other member of said plurality of a nucleic acid array; and
b) subjecting one or more members of said plurality, but at least one less than the total number of said plurality to a method of signal detection comprising a signal amplification method which renders said member of said plurality of a nucleic acid array non-reusable.
7. The method of claim 6 wherein said signal amplification method comprises fluorescence measurement.
8. The method of claim 6 wherein said method of detection of a nucleic acid on an array of nucleic acid molecules detects the amount of an RNA expressed in a first RNA-containing nucleic acid population relative to that expressed in a second RNA-containing nucleic acid population, said method further comprising the steps of:
a) preparing a first fluorescently labeled cDNA population using said first population of RNA-containing nucleic acid as a template;
b) preparing a second fluorescently labeled cDNA population using said second population of RNA-containing nucleic acid as a template, said second fluorescently labeled cDNA population being labeled with a fluorescent label distinguishable from that used to label said first population;
c) contacting a mixture of said first fluorescently labeled cDNA population and said second fluorescently labeled cDNA population with a member of said plurality of nucleic acid arrays under conditions which permit hybridization of said fluorescently labeled cDNA populations with nucleic acids immobilized on said members of said plurality of nucleic acid arrays;
d) detecting the fluorescence of said first fluorescently labeled population of cDNA and the fluorescence of said second fluorescently labeled population of cDNA hybridized to said member of said plurality of nucleic acid arrays, wherein the relative amount of said first fluorescent label and said second fluorescent label detected on a given nucleic acid feature of said array indicates the relative level of expression of RNA derived from the nucleic acid of that feature in the mRNA-containing cDNA populations tested.
9. The method of claim 6 wherein said method of detection of a nucleic acid on an array of nucleic acid molecules measures the amount of an mRNA expressed in a first mRNA-containing nucleic acid population relative to that expressed in a second mRNA-containing nucleic acid population, said method further comprising the steps of:
a) preparing a first fluorescently labeled cDNA population using said first population of mRNA-containing nucleic acid as a template;
b) preparing a second fluorescently labeled cDNA population using said second population of mRNA-containing nucleic acid as a template;
c) contacting said first fluorescently labeled cDNA population with one member of a plurality of immobilized nucleic acid arrays under conditions which permit hybridization of said fluorescently labeled cDNA population with nucleic acid immobilized on said member of a plurality of immobilized nucleic acid arrays;
d) contacting said second fluorescently labeled cDNA population with another member of the same plurality of immobilized nucleic acid arrays used in step (c) under conditions which permit hybridization of said fluorescently labeled cDNA population with nucleic acid immobilized on said member of a plurality of immobilized nucleic acid arrays;
e) detecting the intensity of fluorescence on each member of said plurality contacted with a fluorescently labeled cDNA population in steps (c)-(d); and
f) comparing the intensity of fluorescence detected in step (e) on each member of said plurality of immobilized nucleic acid arrays so tested, to determine the relative expression of mRNA derived from those nucleic acids on the array in the mRNA-containing cDNA populations tested.
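By way of non-limiting illustration, the comparison of fluorescence recited in claims 8 and 9 might be computed per feature as sketched below. The function name, the optional background subtraction, and the choice of a log2 ratio are assumptions made for the example and are not recited in the claims.

```python
import math

def log2_expression_ratios(first_intensity, second_intensity, background=0.0):
    """Return log2(first/second) for each array feature.

    Hypothetical helper: intensities are per-feature fluorescence readings
    for the first and second labeled cDNA populations. A feature whose
    background-corrected signal is not positive yields None.
    """
    if len(first_intensity) != len(second_intensity):
        raise ValueError("intensity lists must cover the same features")
    ratios = []
    for first, second in zip(first_intensity, second_intensity):
        first_corr = first - background
        second_corr = second - background
        if first_corr <= 0 or second_corr <= 0:
            ratios.append(None)  # no usable signal at this feature
        else:
            ratios.append(math.log2(first_corr / second_corr))
    return ratios
```

A positive value indicates higher relative expression in the first population at that feature, a negative value higher relative expression in the second.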
10. A method of preserving the resolution of nucleic acid features on a first immobilized array during cycles of array replication, said method comprising the following steps:
a) amplifying the features of a first array to yield an array of features with a hemispheric radius, r, and a cross-sectional area, q, at the surface supporting said array, such that said features remain essentially distinct;
b) contacting said array of features with a radius, r, with a support, maintained at a fixed distance from said first array, said fixed distance less than r, and such that the cross-sectional area of the hemispheric feature, measured at said fixed distance from the surface supporting said first array is less than q, and such that at least a subset of nucleic acid molecules produced by said amplifying are transferred to said support;
c) covalently affixing said nucleic acid molecules to said support to form a replica of said first immobilized array, wherein the positions of said nucleic acid molecules on said replica correspond to the positions of said nucleic acid molecules of said first array from which they were amplified, and wherein the areas occupied on the surface of said support by the individual features of said replica are less than the areas occupied on the surface supporting said first immobilized array.
11. The method of claim 10 wherein said amplifying is performed by PCR.
12. The method of claim 10 wherein cycles of said steps (a)-(c) are repeated.
13. A method for determining the nucleotide sequence of the features of an immobilized nucleic acid array, said method comprising the steps of:
a) ligating a first double-stranded nucleic acid probe to one end of a nucleic acid of a feature of said array, said first double stranded nucleic acid probe having a restriction endonuclease recognition site for a restriction endonuclease whose cleavage site is separate from its recognition site and which generates a protruding strand upon cleavage;
b) identifying one or more nucleotides at the end of said polynucleotide by the identity of the first double stranded nucleic acid probe ligated thereto or by extending a strand of the polynucleotide or probe;
c) amplifying the features of said array using a primer complementary to said first double stranded nucleic acid probe, such that only molecules which have been successfully ligated with said first double stranded nucleic acid probe are amplified to yield an amplified array;
d) contacting said amplified array with support such that at least a subset of nucleic acid molecules produced by said amplifying are transferred to said support;
e) covalently attaching said subset of nucleic acid molecules transferred in step (d) to said support to form a replica of said amplified array;
f) cleaving the nucleic acid features of the array with a nuclease recognizing said nuclease recognition site of said probe such that the nucleic acid of the features is shortened by one or more nucleotides; and
g) repeating steps (a)-(f) until the nucleotide sequences of the features of said array are determined.
14. The method of claim 13 wherein said nucleic acid probe comprises four components, each component being capable of indicating the presence of a different nucleotide in said protruding strand upon ligation.
15. The method of claim 14 wherein each of said components of said probe is labeled with a different fluorescent dye and the different fluorescent dyes are spectrally resolvable.
16. The method of claim 13 wherein after said step (e) and before said step (f), the features of said array are amplified.
17. The method of claim 13 wherein amplification is performed by PCR.
18. The method of claim 13 wherein:
i) after one or more cycles using said first double stranded nucleic acid probe in step (a), a distinct nucleic acid probe is used, in place of said first double stranded nucleic probe in step (a), said distinct nucleic acid probe comprising a restriction endonuclease recognition site for a restriction endonuclease whose cleavage site is separated from its recognition site, said distinct nucleic acid probe also comprising sequences such that a primer complementary to said distinct nucleic acid probe will not hybridize with said first double stranded nucleic acid probe; and
ii) a primer complementary to said distinct nucleic acid probe is used in place of said primer complementary to said first double stranded nucleic acid probe in step (c), so that selective amplification of those features which successfully completed the previous cycle of restriction and ligation occurs.
19. The method of claim 18 wherein a new distinct nucleic acid probe is used after each cycle of restriction and ligation, said new distinct nucleic acid probe comprising a sequence such that a primer complementary to that sequence will not hybridize to any probe used in previous cycles.
20. A method of determining the nucleotide sequence of the features of an array of immobilized nucleic acids comprising the steps of:
a) adding a mixture comprising an oligonucleotide primer and a template-dependent polymerase to an array of immobilized nucleic acid features under conditions permitting hybridization of the primer to the immobilized nucleic acids;
b) adding a single, fluorescently labeled deoxynucleoside triphosphate to the mixture under conditions which permit incorporation of the labeled deoxynucleotide onto the 3′ end of the primer if it is complementary to the next adjacent base in the sequence to be determined;
c) detecting incorporated label by monitoring fluorescence;
d) repeating steps (b)-(c) with each of the remaining three labeled deoxynucleoside triphosphates in turn; and
e) repeating steps (b)-(d) until the nucleotide sequence is determined.
21. The method of claim 20 wherein the primer, buffer and polymerase are cast into a polyacrylamide gel bearing the array of immobilized nucleic acids.
22. The method of claim 21 wherein said polyacrylamide gel is macroporous.
23. The method of claim 22 wherein said polyacrylamide gel comprises up to about 25% PEG, about 3% to about 12% total acrylamide and about 1% to about 30% crosslinker.
24. The method of claim 23 wherein the percentage of said PEG is about 2.5%.
25. The method of claim 22 wherein said polyacrylamide gel comprises DATD.
26. The method of claim 20 wherein single-stranded binding protein is present during step (b).
27. The method of claim 20 wherein said single fluorescently labeled deoxynucleotide further comprises a mixture of the single deoxynucleoside triphosphate in labeled and unlabeled forms.
28. The method of claim 20 wherein after step (d) and before step (e) the additional step of photobleaching said array is performed.
29. The method of claim 20 wherein said fluorescently labeled deoxynucleoside triphosphates are labeled with a cleavable linkage to the fluorophore.
30. The method of claim 29 wherein after step (d) and before step (e) the additional step of cleaving said linkage to the fluorophore is performed.
31. The method of claim 30 wherein said step of cleaving comprises contacting said linkage with a reducing agent.
32. The method of claim 31 wherein said reducing agent is dithiothreitol.
33. The method of claim 20 wherein said oligonucleotide primer comprises sequences permitting formation of a hairpin loop.
34. The method of claim 20 wherein after a predetermined number of cycles of steps (b)-(d), a defined regimen of deoxynucleotide and chain-terminating deoxynucleotide analog addition is performed, such that out-of-phase molecules are blocked from further extension cycles, said regimen followed by continued cycles of steps (b)-(d) until said nucleotide sequence is determined.
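By way of non-limiting illustration, the cycle of steps (b)-(e) of claim 20 can be simulated for a single feature as sketched below. The function name, the fixed addition order A, C, G, T, and the assumption that a run of identical template bases is filled in during one addition (with the signal taken as proportional to the number of incorporations) are choices made for the example only.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def sequence_by_stepwise_addition(template, order="ACGT", max_rounds=100):
    """Simulate sequential single-nucleotide additions on one feature.

    template: bases of the immobilized strand, listed in the order in
    which the extending primer encounters them (hypothetical input).
    Returns the sequence of the strand synthesized from the primer.
    """
    position = 0  # index of the next template base to be copied
    read = []
    for _ in range(max_rounds):
        if position >= len(template):
            break  # the whole template has been read
        for base in order:  # one labeled nucleotide added at a time
            count = 0
            # Incorporation occurs only while the added nucleotide is
            # complementary to the next adjacent template base.
            while position < len(template) and COMPLEMENT[template[position]] == base:
                count += 1
                position += 1
            read.append(base * count)  # count stands in for signal intensity
    return "".join(read)
```

For example, `sequence_by_stepwise_addition("TTGCA")` returns `"AACGT"`, the complement of the template read in primer-extension order.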
35. A method of determining the nucleotide sequence of the features of an array of immobilized nucleic acids comprising the steps of:
a) adding a mixture comprising an oligonucleotide primer and a template-dependent polymerase to an array of immobilized nucleic acid features under conditions permitting hybridization of the primer to the immobilized nucleic acids;
b) adding a first mixture of three unlabeled deoxynucleoside triphosphates under conditions which permit incorporation of deoxynucleotides to the end of the primer if they are complementary to the next adjacent base in the sequence to be determined;
c) adding a second mixture of three unlabeled deoxynucleoside triphosphates, said second mixture comprising the deoxynucleoside triphosphate not included in the mixture of step (b), under conditions which permit incorporation of deoxynucleotides to the end of the primer if they are complementary to the next adjacent base in the sequence to be determined;
d) repeating steps (b)-(c) for a predetermined number of cycles;
e) adding a single, fluorescently labeled deoxynucleoside triphosphate to the mixture under conditions which permit incorporation of the labeled deoxynucleotide onto the 3′ terminus of the primer if it is complementary to the next adjacent base in the sequence to be determined;
f) detecting incorporated label by monitoring fluorescence;
g) repeating steps (e)-(f), with each of the remaining three labeled deoxynucleoside triphosphates in turn; and
h) repeating steps (e)-(g) until the nucleotide sequence is determined.
36. The method of claim 35 wherein for said first or second mixtures of three unlabeled deoxynucleoside triphosphates, a mixture which comprises deoxyguanosine triphosphate further comprises deoxyadenosine triphosphate.
37. The method of claim 35 wherein the primer and polymerase are cast into a polyacrylamide gel bearing the array of immobilized nucleic acids.
38. The method of claim 35 wherein said single fluorescently labeled deoxynucleotide further comprises a mixture of the single deoxynucleoside triphosphate in labeled and unlabeled forms.
39. The method of claim 35 wherein after step (g) and before step (h) the additional step of photobleaching said array is performed.
40. The method of claim 35 wherein said fluorescently labeled deoxynucleoside triphosphates are labeled with a cleavable linkage to the fluorophore.
41. The method of claim 40 wherein after step (g) and before step (h) the additional step of cleaving said linkage to the fluorophore is performed.
42. The method of claim 35 wherein said oligonucleotide primer comprises sequences permitting formation of a hairpin loop.
43. The method of claim 35 wherein after a predetermined number of cycles of steps (e)-(g), a defined regimen of deoxynucleotide and chain-terminating deoxynucleotide analog addition is performed, such that out-of-phase molecules are blocked from further extension cycles, said regimen followed by continued cycles of steps (e)-(g) until said nucleotide sequence of the features of the array is determined.
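By way of non-limiting illustration, one reading of steps (b)-(d) of claim 35 is that the primer advances under each three-nucleotide mixture until the next required nucleotide is the one omitted from that mixture. The sketch below simulates this under that assumption; the function names and the representation of the template as a string are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def extend_with_mixture(template, position, omitted):
    """Advance until the nucleotide needed next is the omitted one."""
    while position < len(template) and COMPLEMENT[template[position]] != omitted:
        position += 1
    return position

def walk_primer(template, first_omitted, second_omitted, cycles):
    """Alternate two three-nucleotide mixtures for a number of cycles.

    template: bases in the order the primer encounters them (assumed input).
    first_omitted / second_omitted: the nucleotide missing from each mixture.
    Returns the primer 3' end position after every addition.
    """
    if first_omitted == second_omitted:
        raise ValueError("the second mixture must supply the nucleotide omitted from the first")
    position = 0
    trace = []
    for _ in range(cycles):
        for omitted in (first_omitted, second_omitted):
            position = extend_with_mixture(template, position, omitted)
            trace.append(position)
    return trace
```

With the template `"TGCATGCA"`, a first mixture lacking G and a second lacking A, two cycles move the primer end to positions 2, 4, 6 and 8, after which the labeled additions of steps (e)-(h) would begin.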
44. A method of determining the nucleotide sequence of the features of a micro-array of nucleic acid molecules, said method comprising the following steps:
a) creating a micro-array of nucleic acid features in a linear arrangement within and along one side of a polyacrylamide gel, said gel further comprising one or more oligonucleotide primers, and a template-dependent polymerizing activity;
b) amplifying the microarray of step (a);
c) adding a mixture of deoxynucleoside triphosphates, said mixture comprising each of the four deoxynucleoside triphosphates dATP, dGTP, dCTP and dTTP, said mixture further comprising chain-terminating analogs of each of the deoxynucleoside triphosphates dATP, dGTP, dCTP and dTTP, and said chain-terminating analogs each distinguishably labeled with a spectrally distinguishable fluorescent moiety;
d) incubating said mixture with said micro-array under conditions permitting extension of said one or more oligonucleotide primers;
e) electrophoretically separating the products of said extension within said polyacrylamide gel; and
f) determining the nucleotide sequence of the features of said micro-array by detecting the fluorescence of the extended, terminated and separated reaction products within the gel.
45. The method of claim 44 wherein said amplifying is performed by PCR.
46. The method of claim 44 wherein said amplifying is performed by an isothermal method.
47. The method of claim 44 wherein said microarray of nucleic acid features in a linear arrangement is derived as a replica of features arranged on a chromosome.
48. The method of claim 44 wherein said micro-array of nucleic acid features in a linear arrangement is derived as a replica of one linear subset of features on a separate, non-linear micro-array of nucleic acid features.
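By way of non-limiting illustration, step (f) of claim 44 can be pictured as reading the terminator labels of the separated products in order of increasing length. The sketch below assumes each detected product has already been reduced to a (length, base) pair, where the base is inferred from the fluorescent label; the function name and the input format are hypothetical.

```python
def read_terminated_fragments(fragments):
    """Reconstruct the extended-strand sequence for one feature.

    fragments: iterable of (length, terminator_base) pairs, one per
    chain-terminated extension product (assumed input format).
    """
    ordered = sorted(fragments, key=lambda fragment: fragment[0])
    lengths = [length for length, _ in ordered]
    if len(set(lengths)) != len(lengths):
        raise ValueError("expected exactly one product per length")
    # Shorter products migrate further, so ordering by length gives the
    # bases in the order in which they were added to the primer.
    return "".join(base for _, base in ordered)
```

For example, `read_terminated_fragments([(22, "C"), (21, "A"), (24, "T"), (23, "G")])` returns `"ACGT"`.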
49. A method of simultaneously amplifying a plurality of nucleic acids, said method comprising the steps of:
a) creating a micro-array of immobilized oligonucleotide primers;
b) incubating the microarray of step (a) with amplification template and a non-immobilized oligonucleotide primer under conditions allowing hybridization of said template with said oligonucleotide primers;
c) incubating the hybridized primers and template of step (b) with a DNA polymerase activity, and deoxynucleotide triphosphates under conditions permitting extension of the primers;
d) repeating steps (b) and (c) for a defined number of cycles to yield a plurality of amplified DNA molecules.
50. The method of claim 49 wherein said non-immobilized oligonucleotide primer comprises a pool of oligonucleotide primers comprised of 5′ and 3′ sequence elements, said 5′ sequence element identical in all members of said pool, and said 3′ sequence element containing random sequences.
51. The method of claim 50 wherein said 5′ sequence element comprises a restriction endonuclease recognition sequence.
52. The method of claim 50 wherein said 5′ sequence element comprises a transcriptional promoter sequence.
53. The method of claim 49 wherein said immobilized primers are amplified before step (b).
54. The method of claim 49 wherein said immobilized oligonucleotide primers are generated from genomic DNA.
55. The method of claim 49 wherein the microarray, template, non-immobilized primer, and polymerase are cast in a polyacrylamide gel.
56. A method of making a nucleic acid molecule array comprising:
a) providing a liquid mixture of template nucleic acids, at least one oligonucleotide primer, wherein the at least one oligonucleotide primer includes a linker moiety, and monomers capable of forming a polymerized gel matrix;
b) contacting the mixture of step (a) with a solid support,
c) forming a polymerized gel matrix with the linker moiety covalently bound thereto; and
d) amplifying the template nucleic acid to generate a nucleic acid molecule array.
57. The method of claim 56 wherein the monomers are acrylamide and the polymerized gel matrix is a polyacrylamide gel matrix.
58. The method of claim 56 wherein the template nucleic acid comprises template DNA, the at least one oligonucleotide primer comprises at least two amplification primers, and the liquid mixture further comprises a template-dependent DNA polymerase.
59. The method of claim 58 wherein the template-dependent DNA polymerase comprises Taq DNA polymerase.
60. The method of claim 56 wherein the template nucleic acid comprises a variable sequence and further comprises binding sites for the at least one oligonucleotide primer, wherein a binding site is located on each side of the variable sequence.
61. The method of claim 56 wherein said template nucleic acid comprises a library.
62. A method of making a plurality of nucleic acid molecule arrays comprising:
a) providing a first liquid mixture of template nucleic acid, at least one oligonucleotide primer, wherein the at least one oligonucleotide primer includes a linker moiety, and monomers capable of forming a polymerized gel matrix;
b) contacting the mixture of step (a) with a solid support,
c) forming a first layer of a polymerized gel matrix with the linker moiety covalently bound thereto,
d) providing a second liquid mixture of at least one oligonucleotide primer and monomers capable of forming a polymerized gel matrix,
e) contacting the first layer with the second liquid mixture,
f) forming a second layer of a polymerized gel matrix,
g) amplifying the template nucleic acid and transferring amplified nucleic acid to the second layer,
h) removing the second layer; and
i) optionally repeating steps d through h.
63. The method of claim 62 wherein the monomers are acrylamide and the polymerized gel matrix is a polyacrylamide gel matrix.
64. The method of claim 62 wherein the template nucleic acid is template DNA, the at least one oligonucleotide primer comprises at least two amplification primers, and the first liquid mixture further comprises a template-dependent DNA polymerase.
65. The method of claim 64 wherein the template-dependent DNA polymerase is Taq DNA polymerase.
66. The method of claim 62 wherein said template nucleic acid comprises a variable sequence and further comprises binding sites for the at least one oligonucleotide primer, wherein a binding site is located on each side of the variable sequence.
67. The method of claim 62 wherein the template nucleic acid comprises a library.